REMARKS

Reconsideration of the present application is respectfully requested. Claims 
2-10, and 12-15 are pending. Claim 11 has been cancelled as belonging to a non- 
elected invention. The right to pursue this claim in a continuing application is 
reserved. No change of inventorship is necessary. Claim 1 has been cancelled and 
rewritten as new claims 12-15. Support for these claims is found in the claims as 
originally filed, and throughout the specification. In particular, see page 32, lines 15- 
18 regarding new claim 15. No new matter has been added. 

Claims 2, 7, 9, and 10 have been amended. 

Claim 2 has been amended to have proper antecedent basis. Claim 1 has 
been cancelled and rewritten as new claims 12-15. Claim 2 has been amended to 
properly depend from new claim 12. Applicant has also deleted the phrase "in sense 
or anti-sense orientation". Support for these amendments can be found in the 
specification and claims as originally filed, in particular see page 12, lines 20-23, and 
page 44, lines 3-18. 

Claim 7 has been amended to have proper Markush format, as recommended 
by the Examiner. 

Claim 9 has been amended to have proper antecedent basis. Claim 9 has 
been amended to depend from new claim 12. Applicant has changed the gene 
name from "RAD51" to "RAD51C" to clarify the Rad51-like sequences of the claimed 
invention. Applicant has also deleted the term "maize" in reference to the 
sequences of the invention. Support for these amendments can be found in the 
specification and claims as originally filed, in particular see page 4, lines 9-11. 

Claim 10 has been amended to include more plants which can be used in the 
method of claim 9. Support for this amendment can be found in the specification 
and the claims as originally filed. 

Applicant has amended the specification to delete references to internet 
hyperlinks. 

Applicant has amended the abstract to recite "RAD51C" to clarify the
sequences of the invention. Applicant has deleted the term "maize" in reference to 
the sequences of the invention. Support for these amendments can be found in the 
specification and claims as originally filed, in particular see page 4, lines 9-11.

Applicant includes a new Declaration executed by the inventors which 
includes the ZIP Code designation for each post office address. 

The marked up version of these amendments is found on a separate sheet 
attached to this amendment and titled "Version with Markings to Show Changes
Made." It is respectfully requested that the amendments be entered.

Election/Restriction 

The Examiner has issued a restriction requirement, and has required election 
of either the invention of Group I (Claims 1-10) or Group II (Claim 11). Applicants 
hereby affirm the provisional election to prosecute the claims of Group I, with
traverse, and expressly reserve the right to file a divisional application relating to
and claiming the invention of Group II. No change of inventorship is required due to 
this election of Group I. 

The Examiner further required election of one sequence for the application. 
The claims have been amended to remove reference to SEQ ID NOS: 3 and 5 as 
per the election filed September 5, 2001. The Applicants traverse the restriction 
requirement and therefore respectfully request reconsideration of the same. The 
Applicants submit the alignments referred to in the response to the restriction 
requirement filed September 5, 2001 in Appendix A. These alignments demonstrate 
the high degree of homology between SEQ ID NOS: 1, 3, and 5, as well as the 
encoded proteins of SEQ ID NOS: 2, 4, and 6. The polynucleotides of the present 
invention, as shown in Appendix A, share greater than 99% sequence identity as 
determined by the GAP algorithm under default parameters. Applicants believe that 
one search encompasses all the sequences of the invention. As the restriction to 
one sequence is at the discretion of the Examiner, it is hoped the actual analyses
presented will convince the Examiner to rejoin SEQ ID NOS: 1, 3 and 5 for 
examination of this invention. Therefore, the Applicants respectfully request 
withdrawal of the sequence election in this application. 

Rejections under 35 U.S.C. §101:

Claims 1-10 are rejected under 35 U.S.C. §101 as not having either a credible 
asserted utility or a well-established utility. Claim 1 has been cancelled and 
rewritten as claims 12-15; the rejection will be discussed as it applies to these
claims. 

The Examiner asserts that isolated polynucleotides of at least 30 contiguous 
nucleotides encompass nucleic acid sequences encoding a mannanase (Buchert, et 
al., 1997, US Patent 5,661,021), a human DNA repair protein (1998, GenBank
AI184177), and/or an Arabidopsis RAD57 (Rounsley et al., 1998, GenBank 
022144), and that the instant specification does not teach a specific use of these 
nucleic acids. 

Applicants respectfully disagree. Applicants do teach a specific use for the 
polynucleotides claimed. In claim 9, Applicants claim a method to modulate the level 
of Rad51C in a plant using a polynucleotide of claim 12. For example, 
subsequences of a nucleic acid can be used to modulate the level of gene expression,
see page 32, lines 23-33; and page 44, lines 3-18. Therefore, subsequences of 
polynucleotide sequences of the present invention do have a specific utility. Not all 
embodiments must have utility for the invention as a whole to have utility. 
Inoperable embodiments of the claimed invention do not eliminate the utility of the 
operable embodiments. As it is stated in the MPEP 2107 II, page 2100-25: "... as 
the Federal Circuit has stated, '[t]o violate [35 U.S.C.] 101 the claimed device must
be totally incapable of achieving a useful result.' Brooktree Corp. v. Advanced Micro
Devices, Inc., 977 F.2d 1555, 1571, 24 USPQ2d 1401, 1412 (Fed. Cir. 1992)". 

The nucleic acid sequence encoding a mannanase (Buchert, et al., 1997, US
Patent 5,661,021) shares 30 contiguous nucleotides with SEQ ID NO: 1 in the polyA
tail region. Claim 12 recites "An isolated polynucleotide encoding a polypeptide with 
Rad51C activity", therefore the polyA tail of a nucleic acid encoding a mannanase is 
not encompassed by claim 12. Claim 15 claims "A polynucleotide comprising at 
least 50 contiguous nucleotides from a polynucleotide of SEQ ID NO: 1". Therefore, 
claim 15 requires at least 50 contiguous nucleotides, and as such the sequence
disclosed by Buchert, et al. is not encompassed by this claim. 

Applicants do not claim polynucleotide subsequences with a given percent 
identity. Applicants claim at least 80% sequence identity over the full length using 
the GAP program, a Global Alignment Program. Applicants separately claim a 
polynucleotide with at least 50 contiguous nucleotides. Therefore, the sequence of 
AI184177 is not encompassed in claims 12-15. 

Claim 12 recites "An isolated polynucleotide encoding a polypeptide with 
Rad51C activity". Therefore, nucleic acid sequences encoding a mannanase 
(Buchert, et al., 1997, US Patent 5,661,021), a human DNA repair protein (1998,
GenBank AI184177), and/or an Arabidopsis RAD57 (Rounsley et al., 1998,
GenBank 022144) are not encompassed by claim 12. 

The sequence search results provided show an amino acid alignment of 
SEQ ID NO: 2 with an Arabidopsis RAD57 (Rounsley et al., 1998, GenBank
022144). This alignment shows 14 contiguous amino acids shared by the two 
sequences. Applicant submits evidence in Appendix B that shows that SEQ ID NO: 
1 and the polynucleotide encoding an Arabidopsis RAD57 as disclosed by Rounsley 
et al. do not share 30 contiguous nucleotides, even though both polynucleotide 
sequences encode 14 contiguous amino acids. Appendix B contains two 
alignments. First, an alignment (FrameAlign) of the polynucleotide SEQ ID NO: 1 
with the polypeptide of 022144 was done in order to identify the appropriate region 
of SEQ ID NO: 1. Second, a GAP alignment was done of the polynucleotide of SEQ ID NO: 1
and the polynucleotide encoding the polypeptide of 022144. This GAP alignment
shows the two polynucleotide sequences do not share 30 or 50 contiguous 
nucleotides. Therefore, the sequence of 022144 was not encompassed in originally 
filed claim 1, or in new claims 12-15. 
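
By way of illustration only, the question of whether two polynucleotides share a run of at least 30 or 50 contiguous nucleotides can also be checked computationally. The following Python sketch is not the FrameAlign or GAP analysis of Appendix B; the function names and the toy sequences are hypothetical, and the sketch compares one strand only (it does not consider the reverse complement).

```python
def longest_shared_run(seq_a: str, seq_b: str) -> int:
    """Return the length of the longest contiguous stretch found in both sequences.

    Illustrative helper (hypothetical name): a standard dynamic-programming
    longest-common-substring search, case-insensitive, single strand only.
    """
    a, b = seq_a.upper(), seq_b.upper()
    best = 0
    # prev[j] holds the length of the shared run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # extend the run that ended at the previous position in both sequences
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def shares_contiguous(seq_a: str, seq_b: str, min_len: int) -> bool:
    """True if the two sequences share at least min_len contiguous residues."""
    return longest_shared_run(seq_a, seq_b) >= min_len


if __name__ == "__main__":
    # Toy sequences for demonstration; these are NOT the sequences of the application.
    query = "ACGTACGT"
    subject = "TTACGTAA"
    print(longest_shared_run(query, subject))
    print(shares_contiguous(query, subject, 30))
```

Under these assumptions, a threshold of 30 corresponds to the limitation of originally filed claim 1 and a threshold of 50 to that of new claim 15; a reference sequence falls within such a limitation only if its longest shared run meets the threshold.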

Claim 14 claims a polynucleotide comprising at least 100 contiguous 
nucleotides which selectively hybridizes, under stringent conditions and a wash in 
0.1X SSC at 60°C. Support for this claim can be found in the originally filed claims 
and in the specification, for example page 32, lines 15-22. Applicants define 
"selectively hybridizes" on page 14, line 30 - page 15, line 3 of the specification. 
Sequences which selectively hybridize, under stringent conditions, hybridize at least 
2-fold over background and to the substantial exclusion of non-target nucleic acids. 
It is also noted that selectively hybridizing sequences typically have at least about
80% sequence identity with each other. "Stringent conditions" are defined and
discussed on page 15, line 24 - page 17, line 11, particularly page 16, lines 6-13.
The role of post-hybridization washes is also discussed. Given that the 
polynucleotide must be a polynucleotide of at least 100 contiguous nucleotides 
which selectively hybridizes, under stringent conditions, and a wash in 0.1X SSC at 
60°C, nucleic acid sequences encoding a mannanase (Buchert, et al., 1997, US
Patent 5,661,021), a human DNA repair protein (1998, GenBank AI184177), and/or
an Arabidopsis RAD57 (Rounsley et al., 1998, GenBank 022144) are not
encompassed by claim 14. 

Claim 15 claims "A polynucleotide comprising at least 50 contiguous 
nucleotides from a polynucleotide of SEQ ID NO: 1". Nucleic acid sequences 
encoding a mannanase (Buchert, et al., 1997, US Patent 5,661,021), a human DNA
repair protein (1998, GenBank AI184177), and/or an Arabidopsis RAD57 (Rounsley
et al., 1998, GenBank 022144) are not encompassed by claim 15. 

New claims 12-15 do not encompass a subsequence of a nucleic acid 
encoding a mannanase (Buchert et al., 1997, US Patent 5,661,021), a human DNA
repair protein (1998, GenBank AI184177), and/or an Arabidopsis RAD57 (Rounsley
et al., 1998, GenBank 022144). Subsequences of polynucleotides of the present
invention have utility as discussed throughout the specification, for example see 
page 32, lines 23-33; and page 44, lines 3-18. Therefore the rejection under 35 
U.S.C. §101 of claims 1-10, as applied to claims 2-10 and 12-15, should be
withdrawn. 

The Examiner states claim 8 is rejected under 35 U.S.C. §101 as not having a 
specific or well-established utility because the claim does not require that the 
transgenic seed have the expression cassette of claim 2. 

Applicants respectfully disagree. Claim 8 depends from claim 4, which claims 
a transgenic plant comprising a recombinant expression cassette of claim 2. As the 
transgenic plant is required to comprise a recombinant expression cassette of claim 
2, this requirement carries into the dependent claim 8 directed to a transgenic seed. 
Applicants respectfully request the rejection of claim 8 under 35 U.S.C. §101 be 
withdrawn. 

Applicants have properly addressed by argument and amendment the 
grounds for the rejection of originally filed claims 1-10 under 35 U.S.C. §101 as it 
would apply to pending claims 2-10, and 12-15, and respectfully request that the 
rejection of the claims under 35 U.S.C. §101 be withdrawn.

Rejections under 35 U.S.C. §112, first paragraph - Utility:

As the Applicants have responded to the utility rejection under 35 U.S.C. 
§101, the concomitant rejection of claims 1-10 under 35 U.S.C. §112, first paragraph 
based on a lack of utility should be withdrawn and not applied to pending claims 2- 
10, and 12-15. 

Rejections under 35 U.S.C. §112, first paragraph:

Claims 1-10 are rejected under 35 U.S.C. §112, first paragraph. The 
specification does not enable any person skilled in the art to which it pertains, or with 
which it is most nearly connected, to make and/or use the invention commensurate 
in scope with these claims. This rejection will be discussed as it pertains to original 
claims 2-10, and new claims 12-15. 

The Examiner states "Claims 1-10 are rejected under 35 U.S.C. §112, first 
paragraph, because the specification, while being enabling for nucleic acids of SEQ 
ID NO: 1 or that encode SEQ ID NO: 2, does not reasonably provide enablement for 
nucleic acids that have 80% identity to SEQ ID NO: 1, that are amplified from
primers that hybridize under unspecified stringency to 'loci within' SEQ ID NO: 1, or 
that comprise 30 nucleotides that hybridize to SEQ ID NO: 1." 

The Applicants respectfully disagree. The specification provides guidance for 
modification and variants of the polynucleotides and/or polypeptides of the instant 
invention (for example: page 7, line 12 - page 9, line 3; page 12, lines 15-31; page
25, line 24 - page 26, line 6; page 28, line 27 - page 31, line 29; page 30, lines 2-16;
page 58, lines 5-29; and SEQ ID NOS: 1-6), guidance on sequence comparison
and analyses (for example: page 9, lines 4-18; page 17, line 28 - page 23, line 2;
page 28, line 27 - page 29, line 4; and Examples 3 and 4, pages 64-65), guidance
on amplification of polynucleotides (for example: page 6, lines 1-11; page 26, line 8
- page 28, line 3; and page 37, line 25 - page 38, line 15), guidance on hybridization
of polynucleotides (for example: page 14, line 30 - page 15, line 3; page 15, line 24
- page 17, line 11; page 28, lines 5-25; page 31, line 31 - page 32, line 6; page 35,
line 27 - page 36, line 22; and page 37, lines 7-24) and guidance on subsequences
(for example: page 32, line 9 - page 33, line 2). Thus, Applicant respectfully submits
that the specification does enable any person skilled in the art to which it pertains, or 
with which it is most nearly connected, to make and/or use the invention 
commensurate in scope with these claims. 

The Examiner states "The instant specification, however, fails to provide
guidance for which amino acids of SEQ ID NO: 2 can be altered and to which other 
amino acids, and which amino acids must not be changed to maintain RAD51 
activity of the encoded protein." 

Applicants respectfully disagree. The background discusses conserved 
sequences in the RAD51 family (see page 2, lines 13-19). Example 4 on pages 64- 
65 of the specification specifically points out conserved sequence found in SEQ ID 
NO: 2, including a functional domain, the Walker A box ATP-binding motif 
(highlighted). 

At the time of filing, it was well within the capabilities of one of skill in the art to 
determine which amino acids could be altered. For example, methods to assay for 
various functions and phenotypes associated with RAD51 homologues were well- 
known at the time of filing, as evidenced by the documents submitted by the 
Applicant in an IDS filed June 23, 2000. In particular, these references disclose 
several assay methods including yeast two-hybrid screens (Johnson & Symington 
1995; Dosanjh et al. 1998), DNA strand exchange (Sung 1994 and 1997; Sung and
Robberson 1995), complementation (Vispe et al. 1998), homologous recombination
(Vispe et al. 1998; Xia et al. 1997), and gamma-irradiation (Johnson & Symington
1995). Also, at the time of filing the structures of RecA and related proteins were
known, as evidenced in Appendix C, and could be used to model the structure of 
Rad51-like sequences and serve as guidance for allowable modifications. One of 
skill in the art could also use multiple sequence alignments to identify putative 
residues and regions which allow modification; one such multiple sequence
alignment is submitted in Appendix D. 

As is stated in MPEP 2164.01 "A patent need not teach, and preferably omits, 
what is well known in the art. In re Buchner, 929 F.2d 660, 661, 18 USPQ2d 1331, 
1332 (Fed. Cir. 1991); Hybritech, Inc. v. Monoclonal Antibodies, Inc., 802 F.2d 1367, 
1384, 231 USPQ 81, 94 (Fed. Cir. 1986), cert. denied, 480 U.S. 947 (1987); and
Lindemann Maschinenfabrik GMBH v. American Hoist & Derrick Co., 730 F.2d 1452,
1463, 221 USPQ 481, 489 (Fed. Cir. 1984)." 

The Examiner states "It cannot be predicted by one of skill in the art that 
nucleic acids having 80% identity to SEQ ID NO: 1, that are amplified from primers 
that hybridize under unspecified stringency to 'loci within' SEQ ID NO: 1, or that 
comprise 30 nucleotides that hybridize to SEQ ID NO: 1 will encode a protein with 
the same activity as SEQ ID NO: 2." 

The Examiner cites Bowie et al. (1990, Science) which teaches that protein
structure prediction, and ascertaining functional aspects of the protein, from
sequence data is extremely complex. The Examiner also cites Lazar et al., Broun et
al., Burgess et al., and Hill et al., all of which provide examples of very specific
limited amino acid changes which result in elimination or alteration of the 
experimental protein's catalytic activity. 

Applicant notes that Bowie et al. also teaches commonly used methods to
predict tolerance of an amino acid sequence to change: observing tolerated
substitutions in related sequences through evolution (e.g., see Applicants' Appendix
D), and genetic manipulation of sequence (page 1306, paragraph bridging columns
1 and 2). Bowie et al. further reports that studies using these methods reveal that
proteins are highly tolerant of amino acid substitutions, with as many as one-half of
all substitutions being phenotypically silent in lac repressor (page 1306, 1st full
paragraph, column 2). As is noted above, methods to assay function, as well as the
structures of related proteins, were available at the time of filing. Coupled with the
disclosures of the present application, this meant one of skill in the art was reasonably
apprised of the scope of the invention. The invention is directed to compositions of RAD51C,
its activities, and methods of use; non-functional embodiments are not claimed and
do not eliminate the utility of the functional embodiments set forth in the claims.

The Examiner cites Reiss et al. (2000, PNAS) wherein plants transformed 
with RecA unexpectedly did not have an increase in gene targeting. 

Applicants note that Reiss et al. did observe that RecA increased the
fidelity of the recombination (page 3363, left column, lines 1-2). Further, Reiss et al.
postulate that RecA may not have increased gene targeting due to unavailability of
ssDNA substrate in the Agrobacterium-mediated transformation method used (page
3363, left column, 1st full paragraph).

The Examiner asserts that the instant specification fails to teach how nucleic 
acids encoding a mannanase, or how human or Arabidopsis nucleic acids that do 
not encode RAD51, cited in the 35 U.S.C. §101 rejection, could be used to modulate 
the level of maize RAD51 in a plant. 

As discussed in regard to the 35 U.S.C. §101 rejection, the claims of the 
present invention do not encompass non-RAD51-like nucleic acids or proteins. 

The Examiner asserts, given the claim breadth, unpredictability, and lack of 
guidance, that undue experimentation would have been required by one skilled in 
the art to practice the invention. 

The Applicants respectfully disagree. As noted above, Applicants have 
disclosed several sequences (SEQ ID NOS: 1-6), provided guidance regarding 
modifications to the sequences, methods to analyze, isolate, identify and 
characterize the sequences. The 3-dimensional structures of related proteins were
known in the art at the time of filing, as well as methods to assay for functional 
RAD51 homologues. The question of experimentation is a matter of degree. The 
fact that some experimentation is necessary does not preclude enablement; what is 
required is that the amount of experimentation must not be unduly extensive. PPG Inc. v.
Guardian Industries Corp., 37 USPQ2d 1618, 1623 (Fed. Cir. 1996). The test is not
merely quantitative, since a considerable amount of experimentation is permissible, 
if it is merely routine, or if the specification in question provides a reasonable amount 
of guidance with respect to the direction in which the experimentation should 
proceed to enable the determination of how to practice a desired embodiment of the 
invention claimed. Ex parte Jackson, 217 USPQ 804, 807 (1982 PTOBA). 

Applicants have provided reasonable guidance such that one of skill in the art
can practice the breadth of the invention as disclosed and claimed; therefore the
rejection of claims 2-10 and 12-15 under 35 U.S.C. §112, first paragraph should be 
withdrawn. 

Claims 1-10 are rejected under 35 U.S.C. §112, first paragraph, as containing 
subject matter that was not described in the specification in such a way as to 
reasonably convey to one skilled in the relevant art that the inventor(s), at the time
the application was filed, had possession of the claimed invention. 

The Examiner states: "Claim 1 recites no description of the function of the 
protein encoded by the nucleic acid and the plant claims recite no phenotype." 

Claim 1 has been cancelled and rewritten as new claims 12-15. This 
rejection will be addressed as it may be applied to these claims. 

Claim 12 recites the function of the protein in the preamble, "An isolated 
polynucleotide encoding a polypeptide with Rad51C activity". Therefore the 
rejection to claim 1 should not be applied to claim 12. 

Applicant notes that the phenotype of the transgenic plants claimed will 
depend on the components and orientations of the recombinant expression cassette
constructed. For example, a transgenic plant in which a developmental-specific, 
pollen-specific promoter is used to drive transcription of SEQ ID NO: 1 in the 
antisense orientation will have a different phenotype than a transgenic plant in which 
SEQ ID NO: 1 is operably linked in the sense orientation to a strong, constitutive 
promoter. Applicant need not recite the plant phenotype in the claims; the metes
and bounds of claim 4 and dependent claims are clear: the transgenic plant
comprises an isolated polynucleotide of claim 12 in a recombinant expression 
cassette. 

The Examiner cites Dosanjh et al. 1998 (Nucl. Acids Res. 26:1179-1184;
1183 right column, paragraph 2) to support the assertion that different RAD51
proteins appear to have different functions within a cell. 

Dosanjh et al. report the discovery of another member of the RAD51 gene
family. The different members of the RAD51 gene family are components that 
function together in the recombinational repair of DNA (see page 1179, first
paragraph, bridges columns 1 and 2). 

The Examiner cites University of California v. Eli Lilly, 119 F.3d 1559, 43 
USPQ2d 1398 (Fed. Cir. 1997) which states "...there is no further information in the
patent pertaining to that cDNA's relevant structural or physical characteristics; in
other words, it thus does not describe human insulin cDNA...". The Examiner also 
points out, on page 1406: "A definition by function, as we have previously indicated,
does not suffice to define the genus because it is only an indication of what the gene 
does, not what it is." 

The Examiner also cites Amgen Inc. v. Chugai Pharmaceutical Co. Ltd., 18 
USPQ2d 1016 at 1021 (Fed. Cir. 1991). The Examiner directs Applicants' attention
to page 1021: "Conception does not occur unless one has a mental picture of the 
structure of the chemical or is able to define it by its method of preparation, its 
physical or chemical properties, or whatever characteristics sufficiently distinguish 
it." 

It should be noted that in both of the above cases, the claim language 
focused on the biological properties of the claimed sequences. Applicants 
respectfully submit that the present invention is not defined solely by its biological 
property but is defined by structural features such as percent identity and 
hybridization fidelity. These structural features are readily understood by those 
practicing the art and are fully supported in the specification. For example, claim 12 
claims polynucleotides having at least 80% sequence identity to the polynucleotides 
of SEQ ID NO: 1; wherein the percent sequence identity is based on the entire 
coding regions and is determined by the GAP program under default parameters. 
While SEQ ID NO: 1 is clearly defined in the sequence listing, those polynucleotides 
having at least 80% sequence identity to the polynucleotides of SEQ ID NO: 1 are 
clearly defined in the instant specification. The definition of sequence identity is
taught on page 21, line 29 - page 22, line 5 of the specification. The description of
sequence similarity, methods for aligning sequences, and a description of the GAP 
program used to determine the percentage of sequence identity can be found on 
page 18, line 14 - page 23, line 2 of the specification. Further, methods of making 
the invention are also clearly taught. See, for example, pages 34 - 37 where library 
synthesis (page 34, line 6 - page 35, line 25; and page 36, line 24 - page 37, line 5), 
screening of DNA libraries (page 37, line 7 - page 38, line 15), amplification of 
polynucleotides (page 6, lines 1-11; page 26, line 8 - page 27, line 29; and page 37,
line 25 - page 38, line 15), and synthetic preparation of polynucleotides (page 38, 
lines 17-31) are taught. See page 15, line 24 - page 17, line 11; and page 37, lines 
8-24 for a description of hybridization conditions. 

In Amgen v. Chugai, the Federal Circuit concluded that the patent 
specification was insufficient to enable one of ordinary skill in the art to make and 
use the invention claimed in claim 7 of the '008 patent without undue 
experimentation. As stated on page 1027, however, "it is not necessary that a 
patent applicant test all the embodiments of his invention, In re Angstadt, 537 F.2d 
498, 502, 190 USPQ 214, 218 (CCPA 1976); what is necessary is that he provide a 
disclosure sufficient to enable one skilled in the art to carry out the invention 
commensurate with the scope of his claims. For DNA sequences, that means 
disclosing how to make and use enough sequences to justify grant of the claims 
sought." Applicants respectfully submit that this has been done in the instant
specification. The present invention discloses how to make and use the sequences 
of the invention, as discussed in the paragraph above. 

The question of experimentation is a matter of degree. The fact that some 
experimentation is necessary does not preclude enablement; what is required is that the
amount of experimentation must not be unduly extensive. PPG Inc. v. Guardian
Industries Corp., 37 USPQ2d 1618, 1623 (Fed. Cir. 1996).

The present specification provides reasonable guidance with respect to the
direction in which the experimentation should proceed by providing sequences, 
methods, citations and examples sufficient to practice the scope of the claims. 
While the methods require selection of transformed plants exhibiting the desired 
traits/phenotype, the selection is routine and would not require undue 
experimentation. No matter how much detail is provided, one will have to select for 
the desired phenotype. 

The test is not merely quantitative, since a considerable amount of 
experimentation is permissible, if it is merely routine, or if the specification in 
question provides a reasonable amount of guidance with respect to the direction in 
which the experimentation should proceed to enable the determination of how to 
practice a desired embodiment of the invention claimed. Ex parte Jackson, 217 
USPQ 804, 807 (1982 PTOBA). 

With the guidance provided in the present specification, one skilled in the art 
can readily practice the claimed invention. Therefore, it is respectfully requested 
that the rejection of claims 1-10 under 35 U.S.C. §112, first paragraph be withdrawn
and not applied to pending claims 2-10, and 12-15. 

Rejections under 35 U.S.C. §112, second paragraph:

Claims 1-10 have been rejected under 35 U.S.C. § 112, second paragraph as 
being indefinite. Claim 1 has been cancelled and rewritten as claims 12-15. This 
rejection will be addressed as it applies to pending claims 2-10, and 12-15. 

The Examiner asserts claim 1 is indefinite in its recitation of "GAP algorithm". 
This rejection will be discussed as it applies to new claim 12. 

New claim 12 recites "GAP program", as recommended by the Examiner, 
therefore the rejection should not be applied to claim 12. 

The Examiner asserts claim 1, parts (c) and (d) are indefinite in the recitation
of "stringent hybridization conditions" and "selectively hybridize(s)". This rejection 
will be discussed as it applies to new claims 13 and 14. The Examiner further
asserts that claim 1, part (d) is indefinite for not indicating the length of wash time.
This rejection will be discussed as it applies to claim 14. 

Hybridization is a common technique to those of skill in the art, as is 
illustrated by the availability of commercial kits as well as several standard 
references including Sambrook et al. (1989) Molecular Cloning - A Laboratory
Manual, 2nd Ed., Cold Spring Harbor Press; Ausubel et al., Eds. (1994) Current
Protocols in Molecular Biology, Greene Publishing Assoc., Inc. and John Wiley and 
Sons, Inc.; and Berger and Kimmel, Guide to Molecular Cloning Techniques, 
Methods in Enzymology, Vol. 152, Academic Press. 

As it is defined in the specification on page 14, line 30 - page 15, line 3, 

"selectively hybridize(s)": 

'includes reference to hybridization, under stringent hybridization conditions, 
of a nucleic acid sequence to a specified nucleic acid target sequence to a 
detectably greater degree (e.g., at least 2-fold over background) than its 
hybridization to non-target nucleic acid sequences and to the substantial 
exclusion of non-target nucleic acids.'

"Selectively hybridizes", as defined in the specification, indicates selective 
hybridization is at least 2-fold over background as compared to a non-target nucleic 
acid under stringent hybridization conditions. 

"Stringent hybridization conditions" are discussed extensively on pages 15-17 
of the specification. Stringent conditions are those at which the probe will 
selectively hybridize to its target at least 2-fold over background as compared to 
non-target nucleic acids. It is noted that these conditions will be sequence 
dependent, but guidance on conditions is given on pages 16-17. 

The role of wash conditions is discussed on pages 16-17. While temperature 
and ionic strength are viewed as important factors, the time of the wash, in general,
is not. Claim 14 defines the critical wash parameters of ionic strength and
temperature as 0.1X SSC at 60°C.

The test for definiteness is whether one skilled in the art would understand
the bounds of the claim when read in light of the specification. The Examiner is
reminded that, to satisfy the requirements of §112, second paragraph, the claims need
only "reasonably apprise those skilled in the art" as to their scope and be "as precise 
as the subject matter permits". Hybritech Inc. v. Monoclonal Antibodies, Inc., 802 
F.2d 1367, 231 USPQ 81 (Fed. Cir. 1986), cert, denied, 480 U.S. 947 (1987). The 
language of claims 13 and 14 is not ambiguous when read in light of the 
specification. 

Accordingly, claims 13 and 14 fulfill the requirements of 35 U.S.C. §112, 
second paragraph, and the Examiner is respectfully requested to withdraw the 
rejection of claim 1, parts (c) and (d) and not apply the rejection to newly submitted 
claims 13 and 14. 

The Examiner asserts that claim 7 is not written in proper Markush format, as 
it has improper punctuation after the phrase "selected from the group consisting of". 

The Applicant has amended claim 7 to remove the colon after the phrase 
"selected from the group consisting of", as recommended by the Examiner. Claim 7 
is now in proper form and the rejection under 35 U.S.C. §112, second paragraph 
should not be applied to the amended claim. 

Applicant has addressed the rejections under 35 U.S.C. §112, second 
paragraph by proper amendments and arguments. Claims 2-15 are in proper form, 
therefore Applicant respectfully requests the rejections under 35 U.S.C. §112, 
second paragraph be withdrawn. 

Rejections under 35 U.S.C. § 102: 

Claims 1-3 have been rejected under 35 U.S.C. § 102(b) as being anticipated 
by Buchert et al. (1997, US Patent 5,661,021). 

The Examiner asserts "Buchert et al. teach a polynucleotide comprising at 
least 30 contiguous nucleotides of SEQ ID NO: 1. This nucleic acid was in an 
expression cassette and expressed in yeast cells." 

Claim 1 has been cancelled and rewritten as claims 12-15. The rejection will 
be addressed as it may be applied to new claims 12-15, as well as claims 2 and 3. 

The Applicants respectfully traverse the rejection under 35 U.S.C. § 102(b). 
As stated in MPEP 2131, page 2100-54, "To anticipate a claim, the reference 
must teach every element of the claim. 'A claim is anticipated only if each and every 
element as set forth in the claim is found, either expressly or inherently described, in 
a single prior art reference.'" 

Buchert et al. teach mannanase enzymes and uses thereof. Claims 12 and 
13 claim isolated polynucleotides which encode a polypeptide with RAD51C activity. 
Claim 14 claims an isolated polynucleotide comprising at least 100 contiguous 
nucleotides which selectively hybridizes, under stringent conditions, to SEQ ID NO: 1. 
Claim 15 claims an isolated polynucleotide comprising at least 50 contiguous 
nucleotides from a polynucleotide of SEQ ID NO: 1. Buchert et al. do not disclose 
polynucleotides which encode a polypeptide with RAD51C activity, polynucleotides 
of at least 100 contiguous nucleotides which selectively hybridize to SEQ ID NO: 1 
of the instant invention, or polynucleotides of at least 50 contiguous nucleotides 
of SEQ ID NO: 1. Therefore, Buchert et al. do not anticipate claims 12-15, or claims 
2 and 3. 

Claims 1-3 have been rejected under 35 U.S.C. § 102(a) as being anticipated 
by NCI-CGAP (1998, GenBank Accession No. AM84177). 

The Examiner asserts "NCI-CGAP teaches a nucleic acid that comprises 40 
nucleotides with more than 80% sequence identity to SEQ ID NO: 1. This nucleic 
acid would selectively hybridize to SEQ ID NO: 1. This nucleic acid would be in an 
expression cassette like that of a pUC or similar vector and this would be in a host 
cell for purposes of molecular biological manipulation." 

Claim 1 has been cancelled and rewritten as claims 12-15. The rejection will 
be addressed as it may be applied to new claims 12-15, as well as claims 2 and 3. 

Applicants respectfully disagree. Claims 2 and 3 depend from new claim 12. 
New claim 12 does not encompass polynucleotides which hybridize to SEQ ID NO: 1. 
Claim 14 claims an isolated polynucleotide comprising at least 100 contiguous 
nucleotides which selectively hybridizes, under stringent conditions, to SEQ ID NO: 1. 
NCI-CGAP does not teach a nucleic acid of 100 contiguous nucleotides which 
would selectively hybridize under the claimed conditions; therefore, NCI-CGAP does 
not anticipate claims 12-14, or claims 2-3. 

Claims 1-3 have been rejected under 35 U.S.C. § 102(b) as being anticipated 
by Rounsley et al. (1998, GenBank Accession No. 022144). 

The Examiner asserts "Rounsley et al. teach a nucleic acid that comprises 42 
contiguous nucleotides that encodes SEQ ID NO: 2. This nucleic acid would be in 
an expression cassette like that of a pUC or similar vector and this would be in a 
host cell for purposes of molecular biological manipulation." 

Claim 1 has been cancelled and rewritten as claims 12-15. The rejection will 
be addressed as it may be applied to new claims 12-15, as well as claims 2 and 3. 

The sequence search results provided show an amino acid alignment of SEQ 
ID NO: 2 with an Arabidopsis RAD57 (Rounsley et al., 1998, GenBank 022144). 
This alignment shows 14 contiguous amino acids shared by the two sequences. 
Applicant submits evidence in Appendix B, in the form of a GAP alignment, showing 
that SEQ ID NO: 1 and the polynucleotide encoding an Arabidopsis RAD57 as 
disclosed by Rounsley et al. do not share 30 contiguous nucleotides, even though 
both polynucleotide sequences encode the same 14 contiguous amino acids. Further, 
Rounsley et al. do not disclose a polynucleotide, or any vectors or host cells 
comprising a polynucleotide; those features are merely inferred by the Examiner. 
Therefore, the sequence of 022144 does not anticipate originally filed claims 1-3, or 
new claims 12-15. 
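
The point that a shared stretch of amino acids need not imply an equally long shared stretch of nucleotides follows from codon degeneracy, and can be illustrated with a toy sketch. The function names, the nine-base example sequences, and the partial codon table below are hypothetical and chosen only for illustration; they are not the sequences of Appendix B, and this is not the GAP program.

```python
# Partial codon table covering only the codons used in this toy example.
CODONS = {"CTG": "L", "TTA": "L", "TCA": "S", "AGC": "S", "CGT": "R", "AGA": "R"}


def translate(seq: str) -> str:
    """Translate an in-frame sequence using the partial table above."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))


def longest_shared_run(seq_a: str, seq_b: str) -> int:
    """Length of the longest run of contiguous identical nucleotides
    found in both sequences (longest common substring, case-insensitive)."""
    a, b = seq_a.lower(), seq_b.lower()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous position of both.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Two hypothetical sequences encoding the same three amino acids.
first, second = "CTGTCACGT", "TTAAGCAGA"
print(translate(first), translate(second))
print(longest_shared_run(first, second))
```

In this toy case both sequences encode the same peptide while sharing only a very short run of identical nucleotides, which is the kind of comparison a count of shared contiguous nucleotides is meant to capture.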

Applicants respectfully request that the rejections of claims 1-3 under 35 
U.S.C. § 102(a) and 102(b) be withdrawn and not applied to new claims 12-15. 

Rejections under 35 U.S.C. § 103: 

Claims 1-4, 6, and 8-9 have been rejected under 35 U.S.C. § 103(a) as being 
unpatentable over Reiss et al. (1996, Proc. Natl. Acad. Sci. 93:3094-3098) in view of 
Rounsley et al. (supra). 

Claim 1 has been cancelled and rewritten as claims 12-15. The rejection will 
be addressed as it may be applied to new claims 2-4, 6, 8-9, and 12-15. 

New claims 12-15 disclose novel and non-obvious RAD51-like sequences 
that are not disclosed in Reiss et al. or Rounsley et al. separately, or in combination. 
The disclosure of Rounsley et al. is discussed above and illustrated in Appendix B. 
The combination of Reiss et al. and Rounsley et al. does not yield the 
polynucleotides, methods, or compositions of the present invention. It is respectfully 
requested that the rejection of claims 2-4, 6, and 8-9 under 35 U.S.C. § 103(a) be 
withdrawn, and that the rejection not be applied to new claims 12-15. 

Claims 5, 7, and 10 have been rejected under 35 U.S.C. § 103(a) as being 
unpatentable over Reiss et al. (supra) in view of Rounsley et al. (supra), and further 
in view of Gordon-Kamm et al. (1990, Plant Cell 2:603-618). 

Claims 5, 7 and 10 depend from new claim 12, which discloses novel and 
non-obvious RAD51-like sequences that are not disclosed in Reiss et al. or 
Rounsley et al. or Gordon-Kamm et al. separately, or in combination. The 
combination of Reiss et al., Rounsley et al. and Gordon-Kamm et al. does not yield 
the compositions of claims 5, 7, and 10. Therefore it is respectfully requested that 
the rejection of claims 5, 7, and 10 under 35 U.S.C. § 103(a) be withdrawn. 

CONCLUSION 



In light of the foregoing remarks and amendments, withdrawal of the 
outstanding rejections and allowance of all of the remaining claims is respectfully 
requested. Applicants believe that the claims are in condition for allowance. The 
Examiner is invited to telephone the Applicant in order to expedite prosecution of the 
application. 



PIONEER HI-BRED INTERNATIONAL, INC. 

Corporate Intellectual Property 

7100 N.W. 62nd Avenue 

P.O. Box 1000 

Johnston, Iowa 50131-1000 

Phone: (515)270-4192 

Facsimile: (515)334-6883 



Respectfully submitted, 




Virginia Dress 
Agent for Applicant(s) 
Registration No. 48,243 

VERSION WITH MARKINGS TO SHOW CHANGES MADE 

The Applicants have used underlining to denote additions to the original text 
and square brackets [ ] to denote deletions of the original text. 

In the Title : 

The title found on the cover page has been amended as follows: 
[A Novel Maize] Rad51-Like [Gene] Orthologues and Uses Thereof 

In the Specification : 

Paragraph beginning at line 3 of page 19 has been amended as follows: 

Software for performing BLAST analyses is publicly available, e.g., through 
the National Center for Biotechnology Information [(http://www.ncbi.nlm.nih.gov/)]. 
This algorithm involves first identifying high scoring sequence pairs (HSPs) by 
identifying short words of length W in the query sequence, which either match or 
satisfy some positive-valued threshold score T when aligned with a word of the 
same length in a database sequence. T is referred to as the neighborhood word 
score threshold. These initial neighborhood word hits act as seeds for initiating 
searches to find longer HSPs containing them. The word hits are then extended in 
both directions along each sequence for as far as the cumulative alignment score 
can be increased. Cumulative scores are calculated using, for nucleotide 
sequences, the parameters M (reward score for a pair of matching residues; always 
> 0) and N (penalty score for mismatching residues; always < 0). For amino acid 
sequences, a scoring matrix is used to calculate the cumulative score. Extension of 
the word hits in each direction are halted when: the cumulative alignment score falls 
off by the quantity X from its maximum achieved value; the cumulative score goes to 
zero or below, due to the accumulation of one or more negative-scoring residue 
alignments; or the end of either sequence is reached. The BLAST algorithm 
parameters W, T, and X determine the sensitivity and speed of the alignment. The 
BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 
11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both 
strands. For amino acid sequences, the BLASTP program uses as defaults a 
wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix 
(see Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA 89:10915). 
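
As a rough illustration of the extension step described in the paragraph above, the sketch below extends an exact seed match to the right without gaps, scoring matches with M=5 and mismatches with N=-4, and halting on the three conditions listed. The function name, the exact-seed assumption, the rightward-only extension, and the default drop-off value X=20 are assumptions made for the sketch; this is not the BLAST implementation.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 seed_len: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 20) -> tuple[int, int]:
    """Ungapped rightward extension of an exact seed (hypothetical helper).

    Returns (best cumulative score, alignment length at that best score).
    Extension halts when the score falls off by `x_drop` from its maximum,
    when the score goes to zero or below, or when either sequence ends.
    """
    score = seed_len * match          # the seed is assumed to match exactly
    best_score, best_len = score, seed_len
    length = seed_len
    i, j = q_start + seed_len, s_start + seed_len
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len


# Toy sequences (assumed): eight matching bases followed by mismatches.
print(extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, 4))
```

A smaller X stops the extension sooner after a run of mismatches, which is the speed-versus-sensitivity trade-off the paragraph attributes to the parameters W, T, and X.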

Paragraph beginning at line 8 of page 64 has been amended as follows: 

Gene identities were determined by conducting BLAST (Basic Local 
Alignment Search Tool; Altschul, S. F., et al., (1990) J. Mol. Biol. 215:403-410[; see 
also www.ncbi.nlm.nih.gov/BLAST/]) searches under default parameters for 
similarity to sequences contained in the BLAST "nr" database (comprising all non- 
redundant GenBank CDS translations, sequences derived from the 3-dimensional 
structure Brookhaven Protein Data Bank, the last major release of the SWISS-PROT 
protein sequence database, EMBL, and DDBJ databases). The cDNA sequences 
were analyzed for similarity to all publicly available DNA sequences contained in the 
"nr" database using the BLASTN algorithm. The DNA sequences were translated in 
all reading frames and compared for similarity to all publicly available protein 
sequences contained in the "nr" database using the BLASTX algorithm (Gish, W. 
and States, D. J. Nature Genetics 3:266-272 (1993)) provided by the NCBI. In some 
cases, the sequencing data from two or more clones containing overlapping 
segments of DNA were used to construct contiguous DNA sequences. 

In the Abstract : 

The Abstract beginning at line 1 of page 68 has been amended as follows: 

ABSTRACT OF THE DISCLOSURE 
The invention provides isolated [maize] RAD51C nucleic acids and their encoded 
proteins. The present invention provides methods and compositions relating to 
altering [maize] RAD51C levels in plants. The invention further provides 
recombinant expression cassettes, host cells, transgenic plants, and antibody 
compositions. 

In the Claims : 

Claims 1 and 11 have been cancelled without prejudice. 
Claims 2, 7, 9 and 10 have been amended as follows: 

2. (Amended) A recombinant expression cassette comprising a member of claim 
[1] 12 operably linked[, in sense or anti-sense orientation,] to a promoter. 

7. (Amended) The transgenic plant of claim 4, wherein said plant is selected 
from the group consisting of[:] maize, soybean, sunflower, sorghum, canola, 
wheat, alfalfa, cotton, rice, barley, and millet. 

9. (Amended) A method of modulating the level of [maize] RAD51C in a plant, 
comprising: 

(a) introducing into a plant cell a recombinant expression cassette 
comprising a [maize RAD51] polynucleotide of claim [1] 12 operably 
linked to a promoter; 

(b) culturing the plant cell under plant cell growing conditions; 

(c) regenerating a whole plant which possesses the transformed 
genotype; and 

(d) inducing expression of said polynucleotide for a time sufficient to 
modulate the level of [maize] RAD51C in said plant. 

10. (Amended) The method of claim 9, wherein the plant is selected from the 
group consisting of maize, soybean, sunflower, sorghum, canola, wheat, 
alfalfa, cotton, rice, barley, and millet. 

New claims 12-15 have been added as follows: 

12. An isolated polynucleotide encoding a polypeptide with Rad51C activity 
comprising a member selected from the group consisting of: 

(a) a polynucleotide having at least 80% sequence identity over the entire 
length of the reference sequence, as determined by the GAP program 
under default parameters, to a polynucleotide of SEQ ID NO: 1; 

(b) a polynucleotide encoding a polypeptide of SEQ ID NO: 2; 

(c) a polynucleotide of SEQ ID NO: 1; 

(d) a polynucleotide which is fully complementary to a polynucleotide of 
(a), (b), or (c). 

13. An isolated polynucleotide amplified from a Zea mays nucleic acid library 
using primers which selectively hybridize, under stringent hybridization 
conditions, to loci within a polynucleotide of SEQ ID NO: 1, wherein the 
polynucleotide encodes a polypeptide with Rad51C activity. 



14. An isolated polynucleotide comprising at least 100 contiguous nucleotides 
which selectively hybridizes, under stringent hybridization conditions and a 
wash in 0.1X SSC at 60°C, to a polynucleotide of SEQ ID NO: 1. 



15. An isolated polynucleotide comprising at least 50 contiguous nucleotides from 
a polynucleotide of claim 12. 

Gap Results 

GAP of: 1107sid3  check: 3152  from: 1  to: 1456 
WPDEF Case 1107 SEQ ID NO: 3 
Case 1107 Rad51-like sequences SEQ ID NO: 3 from SEQ LISTING 

to: 1107sid5  check: 9084  from: 1  to: 1333 
WPDEF Case 1107 SEQ ID NO: 5 
Case 1107 SEQ ID NO: 5 from SEQ LISTING. Rad51-like sequences 

Symbol comparison table: nwsgapdna.cmp  CompCheck: 8760 

Gap Weight: 50              Average Match: 10.000 
Length Weight: 3            Average Mismatch: 0.000 
Quality: 12858              Length: 1470 
Ratio: 9.646                Gaps: 4 
Percent Similarity: 99.318  Percent Identity: 99.242 

Match display thresholds for the alignment(s): 
| = IDENTITY 
: = 5 
. = 1 

1107sid3 x 1107sid5    August 28, 2001 16:14 

[Nucleotide alignment of 1107sid3 (positions 1-1456) against 1107sid5 (positions 1-1333). The base-by-base alignment rows are not legible in this copy.] 

Input Sequence: 1107sid3 
WPDEF Case 1107 SEQ ID NO: 3 
Case 1107 Rad51-like sequences SEQ ID NO: 3 from SEQ LISTING 
Length: 1456  Check: 3152  August 28, 2001 15:56  Type: N 

Input Sequence: 1107sid5 
WPDEF Case 1107 SEQ ID NO: 5 
Case 1107 SEQ ID NO: 5 from SEQ LISTING. Rad51-like sequences 
Length: 1333  Check: 9084  August 28, 2001 16:00  Type: N 

Gap Results 

GAP of: 1107sid4  check: 3715  from: 1  to: 281 
WPDEF Case 1107 SEQ ID NO: 4 
Case 1107 SEQ ID NO: 4 from SEQ LISTING. Rad51-like sequences. Protein 

to: 1107sid6  check: 4041  from: 1  to: 294 
WPDEF Case 1107 SEQ ID NO: 6 
Case 1107 SEQ ID NO: 6 from SEQ LISTING. Rad51-like sequences. Protein 

Symbol comparison table: [illegible] 

Gap Weight: 8               Average Match: 2.912 
Length Weight: 2            Average Mismatch: -2.003 
Quality: [illegible]        Length: 294 
Ratio: [illegible]          Gaps: 1 
Percent Similarity: 99.644  Percent Identity: 99.644 

Match display thresholds for the alignment(s): 
| = IDENTITY 
: = 2 
. = 1 

1107sid4 x 1107sid6    August 28, 2001 16:17 


[Amino acid alignment of 1107sid4 (positions 1-281) against 1107sid6 (positions 1-294). The residue-by-residue alignment rows are not legible in this copy.] 



Input Sequence: 1107sid4 
WPDEF Case 1107 SEQ ID NO: 4 
Case 1107 SEQ ID NO: 4 from SEQ LISTING. Rad51-like sequences. Protein 
Length: 281  Check: 3715  August 28, 2001 16:06  Type: P 

Input Sequence: 1107sid6 
WPDEF Case 1107 SEQ ID NO: 6 
Case 1107 SEQ ID NO: 6 from SEQ LISTING. Rad51-like sequences. Protein 
Length: 294  Check: 4041  August 28, 2001 16:07  Type: P 



Gap Results 

GAP of: 1107sid1  check: 4817  from: 1  to: 1474 
WPDEF Case 1107 SEQ ID NO: 1 
Case 1107 Rad51-like sequences. From SEQ LISTING. 

to: 1107sid3  check: 3152  from: 1  to: 1456 
WPDEF Case 1107 SEQ ID NO: 3 
Case 1107 Rad51-like sequences SEQ ID NO: 3 from SEQ LISTING 

Symbol comparison table: nwsgapdna.cmp  CompCheck: 8760 

Gap Weight: 50              Average Match: 10.000 
Length Weight: 3            Average Mismatch: 0.000 
Quality: [illegible]        Length: [illegible] 
Ratio: [illegible]          Gaps: [illegible] 
Percent Similarity: 99.924  Percent Identity: 99.924 

Match display thresholds for the alignment(s): 
| = IDENTITY 
: = 5 
. = 1 

1107sid1 x 1107sid3    August 28, 2001 16:13 



[Nucleotide alignment of 1107sid1 (positions 1-1474) against 1107sid3 (positions 1-1456). The base-by-base alignment rows are not legible in this copy.] 

Input Sequence: 1107sid1 
WPDEF Case 1107 SEQ ID NO: 1 
Case 1107 Rad51-like sequences. From SEQ LISTING. 
Length: 1474  Check: 4817  August 28, 2001 15:55  Type: N 

Input Sequence: 1107sid3 
WPDEF Case 1107 SEQ ID NO: 3 
Case 1107 Rad51-like sequences SEQ ID NO: 3 from SEQ LISTING 
Length: 1456  Check: 3152  August 28, 2001 15:56  Type: N 



Gap Results 

GAP of: 1107sid1  check: 4817  from: 1  to: 1474 
WPDEF Case 1107 SEQ ID NO: 1 
Case 1107 Rad51-like sequences. From SEQ LISTING. 

to: 1107sid5  check: 9084  from: 1  to: 1333 
WPDEF Case 1107 SEQ ID NO: 5 
Case 1107 SEQ ID NO: 5 from SEQ LISTING. Rad51-like sequences 

Symbol comparison table: nwsgapdna.cmp  CompCheck: 8760 

Gap Weight: 50              Average Match: 10.000 
Length Weight: 3            Average Mismatch: 0.000 
Quality: 13160              Length: 1474 
Ratio: 9.872                Gaps: [illegible] 
Percent Similarity: 98.725  Percent Identity: 98.650 

Match display thresholds for the alignment(s): 
| = IDENTITY 
: = 5 
. = 1 

1107sid1 x 1107sid5    August 28, 2001 16:14 



101 acggcgcggcgcgactcccccctaagcgacagcggcggcgtcgacgtaag 150 

1 I I I I II I I I 

cgacgtaag 9 

151 JMctgcgtggcg^ 200 

10 cggctgcgtggcgccaccgacggaggctacgagcggttgtggaggcagat 59 
201 ^gagaggtggaggtggctacaacgggtcggcggctgtgagatactgaaa 250 
60 atgagaggtggaggtggctacaacgggtcggcggctgtgagatactgaaa 109 

251 tccgcactgcagttctcttcttcccccaatcagtaccacrtctccaagtg 300 
110 '' IIIM | IMIIMII| IMMIMMIMIMMMIMMMMMI 
110 tccgcactgcagttctcttcttcccccaatcagtaccacctctccaagtg 159 

301 ^aatcaccatgggagatcaatctggctrtagaaatgga^acaacagaa 350 

i6n ill ! 1111 111 " 111 11 11 1111 IHIII I II 

160 gcaatcaccatgggagatcaatctggctctagaaatggaccacaacagaa 209 

351 M???M????????? CagaatgCCt ^ ata tgttctctgatgagctgt 400 
210 ita^i' IIIMIMMIIMIIII| IMMMMIMIMMIIMM 
210 gtacgtttcaggagcccagaatgcctgggatatgttctctgatgagctgt 259 

401 cacagaaacacatcactactggttctggtgacctcaatgacatacttggt 450 
?fin MIIIIIIIIIIII '''MMMMMMIMMIMIIMMMIIMI 
260 cacagaaacacatcactactggttctggtgacctcaatgacaialiiggi 309 

451 gg ^ggattcactgcaaagaagttactgagatcggtggcgtcccaggggt 500 
11 1 1 1 1 1 1 1 1 1 1 1 1 I I M I I I || I I I II I | II II II II I II I I I I I | | || 

http://eshpc02.es.dupontxo m :8000/project/training_kkl/result/l 1 07sidl_gap_4253 1 .htm 



02/25/2002 



Gap Results 

A ^ Page 2 of 4 

310 ggcgggattcactgcaaag^gttactgagatcggtggcgtcccaggggt sW 

360 tggtaaaactcaactggggattcaactagcaatcaatgtacaaaiccili 409 

551 ^-tgtggtg^ 6Qo 

410 tggaatgtggtggccttggtgggaaagcagtttaiaiagailllglggii 459 

ACn 1 M 1 1 1 1 I I I I I M I I I I I I I M I I I I I I I I I | || | u I I I I I I I I II i I 
460 agtttcatggttgaacgtgtctaccagattgctgaagggigiliiliggl 5 09 

M M M m????M M????? agCCatgagaagtCCtC " Ct ^ cca - 700 
510 catactggagcactttccgcacagccatgagaagtcctcttctgtccaaa 559 

701 "caattacagcctgagcgtttcctggcggatatctattacttccggata 750 

Sfin MIII | MIIII| IIIIIIIIIIIIIIIM||||||IIII|||IIIIIM 

560 aacaattacagcctgagcgtttcctggcggatatctattacttccggatl 609 

M M M M M M?????M M??? gtCataaaCtaCatggagaa 9 ttcc t 800 
610 tgcagttacaccgaacaaattgcagtcataaactacatggagaagiiiii 659 
801 ^agcataaagatgtgcgta^ 850 

fiKn 11 1 11 I 1 I I I I I N I I II II I I I I I I I I I I I I I I | | | I I I I I I I I I I I I I 

660 cagagagcataaagatgtgcgtatagttattattgatagtgitaittici 709 

tn M m M???M M M???M? g9CaCtgaggaCCagagtgctaa ^ 900 

7in mil 1 1 1 1 1 1 1 m 1 1 1 m 1 1 n 1 1 1 1 1 1 1 1 m 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

710 actttcgacaagattttgaagatctggcactgaggaccagagtgctaagt 759 

901 ^"atcattgaagttaatgaagattgcaaagacatataacttggcagt 950 

760 ggattatcattgaagttaatgaagattgcaaagacatataac^ggiagt 809 

M M M?N M m???n??? a ? taaa " taCagaagggtcatttcaa t 1000 
810 ;'ti!!!!' IIIIIIMIIIIIIIIIIIM| l''l'IMMIIIIMI|| 
810 tgtcttgttgaaccaagtcactactaaatttacagaagggtcatttcaat 859 

1001 ^-^"^ctaggtgacagctggtcccactcatg 1050 
p«n I I i " ' ' ' 1 1 1 1 1 1 1 11 1 1 1 1 1 I I I I I I I I I I M I I I I I | | | I I I 
860 tgactcttgctctaggtgacagctggtcccactcatgcacgaaicgg^g 909 

1051 ^^^cactggaatgggaacgaacgatacgcacatcttgataagtctc; 1100 

93o l;;!; Nll i IIIIMIIMIIIM|||| iiiiiiiiMiiiiiiiiiii 

910 attctgcactggaatgggaacgaacgatacgcacatcttgataagtctcc 959 

1101 M M M M m M?M M?????? gtatgCagtgaCaggCaaagg g atta "50 
Qfin 11 11 ' " 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I I I I N I I I I I | | | | | 
960 ttcacttccagtagcctcagcaccgtatgcagtgacaggcaaagggltti 1009 

1151 gagatgctgtgagctcaaaccacaagcgagcccgagtaaigtagcattc; 1200 

1mn 1 11 l| M ' I II I I I I I | | | | I I I I I I I I I I | I I I I I I I | | I I I I I 

1010 gag-tgctgtgagctcaaaccacaagcgagcccgagtaaigilgiliirt 1059 

1201 tggtgtcaagcacttgtatgtccactacg^tcctgcagc^ttcttcgcca 1250 

1060 'i'' 1 ' 1111111111111 !!!!!!!!!!!!!!!!) IMII|||||||| 

1060 tggtgtcaagcacttgtatgtccactacgctcctgctgctttcttcgcca 1109 

1251 tggatcttttggactagtgaggtgagactggagaatagtaccattttgtt 1300 




1110 tggatcttttggactagtgaggtgagactggagaatlg^ccattttg^ 1159 

1301 ^ttctcagttgctttgtgccgttggctaccaaccaaccttaagagaga; 1350 

1160 ^ttctcagttgctttgtgccgttggctaccaaccaaccttaagagagaa 1209 

1351 ^aaatacaacagaacaggctaatatagtgttttgtat rtg aacatct g g 1400 

1210 gtaaatacaacagaacaggctaatatagtgttttgtatctgiacatctgg 1259 

1401 C ^;^tacattcagtaaagcctataata g cgggcaaaaaaaaaaaaaa 1450 

• ||I ' III| IIIIIIIIII||||MIIIMIMII|| | | 
1260 ^catcgtacattcagtaaagcctataatagcgggcatatatgtgcttct 1309 

1451 aaaaaaaaaaaaaaaaaaaaaaaa 1474 

I I I I I I 1 I I I I I I I I [ I | | 
1310 ctgatcaaaaaaaaaaaaaaaaaa 1333 






Input Sequence: 1107sid1

!!NA_SEQUENCE 1.0

WPDEF Case 1107 SEQ ID NO: 1

Case 1107 Rad51-like sequences. From SEQ LISTING.
1107sid1 Length: 1474 August 28, 2001 15:55 Type: N
Check: 4817 ..

1 tcgacccacg cgtccgcact tgactcccag tctcccactg tgcgcagttc

51 gcttggtccc cggagcccca aaggcggcgg tgagccggag



Input Sequence: 1107sid5

!!NA_SEQUENCE 1.0

WPDEF Case 1107 SEQ ID NO: 5

Case 1107 SEQ ID NO: 5 from SEQ LISTING. Rad51-like sequences.
1107sid5 Length: 1333 August 28, 2001 16:00 Type: N
Check: [illegible] ..

1 cgacgtaagc ggctgcgtgg cgccaccgac ggaggctacg agcggttgtg






Multiple Sequence Alignment Results

!!NA_MULTIPLE_ALIGNMENT 1.0

Symbol comparison table: pileupdna.cmp CompCheck: 6876

GapWeight: 5
GapLengthWeight: 1

1107sid1_pileup_42431.txt MSF: 1611 Type: N August 28, 2001 16:08 Check: 2750 ..

Name: 1107sid1 Len: 1611 Check: 421 Weight: 1.00
Name: 1107sid5 Len: 1611 Check: 9483 Weight: 1.00
Name: 1107sid3 Len: 1611 Check: 2846 Weight: 1.00

//



50 



no^ii 1 !!!!!!!!!? !£!!!!!?! !?!l tcccag tctcccac ^agttc 

1107sid3 - ~ ^ 



51 



llTiliil !!!!! ggcgg ^^cggag 

1107sid3 — _ 



100 

cccggagacg 



101 

nodi's !!!!!?!!!! ??!!f!!ff? !!!!!?! gac ******* tcgacgtifg 

1107sid3 — ~ -^-^ ^ IIIII^" " ^cgacgtaag 

" ' — ~ "-cgacgtaag 

151 

uS?3W5 cnn? 9Cg f g gCgCCSCC ^ cggaggctac gagcggttgt ggaggcaga' 
1107^3 ? Cg 9 ^ cc ^cga cggaggctac gagcggttgt ggaggSga^ 
H07sxd3 cggc.gcgtg gcgccaccga cggaggctac gagcggttgt ggagg^ga^ 

201 

JlS^sldS Itltlt^l f; ggt ^ Cta caacgggtcg gcggctgtga gatactga'aa 

U07s^3 a^aJJJSa l* 99 ? 9 ™* ^^tcg gcggctgtga gatactgaaa 

S!d3 a.gagaggtg gaggtggcra caacgggtcg gcggctgtga gataci-gaaa 



251 

1107sidl tccgcac 
lK)7sid5 tccgcac 



tgc agttctcttc ttcccccaat cagtaccacc tctccaag^q 

1107sid3 tccqcac'oc ^^T 0 " CCCCcaat cagtaccacc tctccaagtg 
lOi tccgcac,gc ag^ctcttc ttcccccaat cagtaccacc tctccaagtg 

301 

UoSol T ^ cacaaca^ 

1 107sid3 g c,a tc acc, t ™ ^ ESSE! cac"c^ 

351 

nSSdS ««St* Ca g9agCCCaga ^gcctggga tatgttctct gatgagctgt 

110 Ld3 g^acat "c? 99agCCCaga ^gcctggga tatgttctct gatgagc gr 

s ld J gtacgtl,ca ggagcccaga atgcctggga tatgttctct gatgagctgt 

401 

iiS?a 8 W5 ZTalTaTat ^ Cac * act ^ctggtg acctca.tga catacttggt 

— -sss ssssss sssS 


"S?:js £s:"c a ^: acrgag atcggtggcg ^™t° 

1107sid3 ggcgggatlc a^caaaaa r^"" 9 * 9 3tCggtggcg tcccaggggt 
yy yyydutc ac.gcaaaga a gti.actgag atcggtggcg tcccaggggt 

501 

noSdJ , tggt3aaact caactgggga ttcaactagc aatcaatgta caaatcccaa" 
1107^ , tggtaaaact caactgggga ttcaactagc aatcaatgta caaa^cccaa 
1107 S1 d3 tggtaaaact caactgggga ttcaactagc aatcaatgta caaatcccag[ 

551 

uS?Iw5 taaaS 9 " 99 ! ggCC " ggt Mgaaagcag tttatataga tacagagjgc 

lWaitt t g a g aTl g l gg ?« cctt TOt yggaaagcag tttatataga tacagagggc 

1107sid3 tggaatgtgg tggccttggt gggaaagcag tttatat. . ...agagggc 

601 

iiS?SS fK? n 9 " 999t 9tatta999 ° 
"o,.u3 agtttMt ^ ; t ^2& 9 «--^ 

651 

U07.U3 „t«tS4 caetS^e ~£ 
701 

uSSdS aacSt' 3 " gCCtgagCgt ttcctggcgg atatctatta cttccgglta 
lloSd3 a rttlt gCCtgagc ^ ttcctggcgg atatctatta cttccggata 
H07 S1 d3 aacaattaca gcctgagcgt ttcctggcgg atatctatta cttccgglta 

nSSidS f g ' ag " taCa ^cgaacaaat tgcagtcata aactacatgg agaagt,^ 

™ = — :~~ — - 

801 

uSt^S ° agagagCat "agatgtgc gtatagttat tattgatagt gttactttcc 

1107sid5 cagagagcat aaagatgtgc gtaraq^^ — n-„ a t. a ^ II l^l 
1107 si d3 ca„ 9 a g cat a^.^c ££££ 

851 

UOT^dS a2"S^ gatCtggcac tgaggaccag agtgctaa^ 

1107sid5 a ^".'- CgaCa a 9 at tttgaa gatctggcac tgaggaccag agtgctaagt 

1107sxd3 actttcgaca agattttgaa gatctggcac tgacqaccaa ail "S 



901 



ggcac tgaggaccag agtgctaagt 

950 



110lT«l ggattatcat tgaagttaat gaagattgca aagacatata acttggcagj 

nS?»W3 qaat^tS ! gaag \ taat 9-gattgca aagacatata act-ggc^t 
1107sid3 ggat.atcat tgaagttaat gaagattgca aagacatata acttggcag* 



951 






nO^S t'St^SS a a a a c C c C r g t tCa tcatttcaat 
1107sid3 ^tct\atta a ^ CaagtCa «actaaatt tacagaaggg tcatttcaat 
5id3 tgtcttgttg aaccaagtca ctactaaatt. tacagaaggg tcatttcaat 

nO^idS 1 t'Sct'- 9C f taggtgac agctggtccc actcatgcac gaaccgg^tg 

-S2 «S :~ ~ 9 

1051 

H07sidl attctgcact ggaatgggaa egaacgatac gcacatcttg ataagtct'cc 



SSSS ESSE ~S ™- sfe 



a^^t 



1101 



1150 



uS'^dS "cSS 2!'S Ca5 "^tatgc a„ 9 „ 8tta 

no7.id3 ttSSiis &s =:«^:^ c 



1151 



1107sidl gagatgctgt gagct-caa 
1107sid5 

1107sid3 gagatg.tgt 
1201 



agtgacaggc aaagggatta 



1200 



1107sid5 gagatg^gt aaactca'-^ ^ Caagcgag c ^gagtaac gtagcattct 
aJ, 2. g Z gagctcacl - c cacaagcgag cccgagtaac gi-agca*-tc<- 
gagctcaaac cacaagcgag cccgagtaac gtagcattct 



1250 

:cgcca 
:cgcca 

ictacgc tcctgcagct ttcttcgcca 

1300 




gtatg tec, 



ggactagtga ggtgagactg gagaatagta ccattttgt*- 
ggactagtga ggtgagactg gagaatagta ccatttfgtt- 
:tagtga ggtgagactg gagaatagta ccar . tr 



ggact 



1251 

1107sidl tggatctttt 
1107sid5 tggatctttt 
1107sid3 tggatctttt 

1301 

iisss SttScS list-Sit c9 ; t99ctac caacc -« t "<"™« 

— «s ™s t^ziii 

1351 



1107sidl 
1107sid5 
1107sid3 



gtaaatacaa 
gtaaatacaa 



1400 



cagaacaggc taatatagtg ttttgtatct gaacatctgg 
gtaaatacaa 1^^° taa tatagtg ttttgtatct gaacatctgg 
gtaaatacaa caga.caggc taatatagtg ttttgtatct gaacatct 



gg 

1450 

gcgggcaaaa aaaaaaaaaa 
gcgggcatat atgtgcttct 
gcgggcatat atgtgcttct 

1500 



1401 

J?n7 Si ^ CCCatcgtac attcagtaaa gcctataata 

J sccatcgtac attcagtaaa gcctataata 

1107 S1 d3 cccatcgtac attcagtaaa gcctataata 

1451 

1107sidl aaaaaaaaaa aaaaaaaaaa aaaa^^ 

]\ll 3 ^tl Ctgatcaaaa aaaaaaaaaa aaaa I Z~ 

H07 S1 d3 ctgatcaccg atcagcaaaa aaaaaaaaaa aaaaaaaaaa aaaa^ 

1501 

1107sidl ^ 1550 

H07sid5 HI lllT~ ~ 

H07sid3 aaaaaaaaaa aaaaaaaaaa ZZlZlll IZllZZl 

1551 

H07sidl — — _ 1600 

1107sid5 — < ~ HZ lllHZ~~~~~~ 

1107s±d3 aaaaaaaaaa aaaaaaaall ZIZZZZ ZIZIZZ 1ZZ1ZZ 

1601 1611 

1107sidl 

1107sid5 ,. 

1107sid3 aaaaaaaaaa a 
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Multiple Sequence Alignment Resi 



Multiple Sequence Alignment Dendrogram, August 28, 2001 (tree image not recoverable)
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Page 1 of 2 

Gap Results

GAP of: 1107sid2 check: 4041 from: 1 to: 294
WPDEF Case 1107 SEQ ID NO: 2

Case 1107 SEQ ID NO: 2 from SEQ LISTING. Rad51-like sequences. Protein

to: 1107sid4 check: 3715 from: 1 to: 281
WPDEF Case 1107 SEQ ID NO: 4

Case 1107 SEQ ID NO: 4 from SEQ LISTING. Rad51-like sequences. Protein

Symbol comparison table: blosum62.cmp CompCheck: 6430

BLOSUM62 amino acid substitution matrix.

Reference: Henikoff, S. and Henikoff, J. G. (1992). Amino acid
substitution matrices from protein blocks. Proc. Natl. Acad.
Sci. USA 89: 10915-10919.

Gap Weight: 8 Average Match: 2.912
Length Weight: 2 Average Mismatch: -2.003

Quality: 1449 Length: 294
Ratio: 5.157 Gaps: 1
Percent Similarity: 99.644 Percent Identity: 99.644

Match display thresholds for the alignment(s):
| = IDENTITY
: = 2
. = 1

1107sid2 x 1107sid4 August 28, 2001 16:15
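The Ratio values printed in the Gap reports of these appendices are consistent with the Quality score divided by the length of the shorter sequence. The following is a minimal sketch of that check, not part of the original printouts. The function and variable names are illustrative, and the reading of Ratio as Quality per residue of the shorter sequence is an assumption that the printed numbers happen to satisfy.

```python
# Quality, length of the shorter sequence, and reported Ratio,
# copied from the Gap report headers in Appendices A and B.
REPORTS = {
    "1107sid2 x 1107sid4": (1449, 281, 5.157),
    "1107sid2 x 1107sid6": (1530, 294, 5.204),
    "1107sid1 x ac002387cds": (5743, 999, 5.749),
}


def gap_ratio(quality: int, shorter_length: int) -> float:
    """Quality divided by the shorter sequence length, rounded to 3 places."""
    return round(quality / shorter_length, 3)


def check_reports(reports=REPORTS) -> dict:
    """Map each report name to True when the printed Ratio is reproduced."""
    return {
        name: abs(gap_ratio(quality, length) - ratio) < 1e-9
        for name, (quality, length, ratio) in reports.items()
    }


if __name__ == "__main__":
    for name, ok in check_reports().items():
        print(name, "consistent" if ok else "inconsistent")
```

All three reports reproduce their printed Ratio under this assumption.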

  1 MGDQSGSRNGPQQKYVSGAQNAWDMFSDELSQKHITTGSGDLNDILGGGI 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGDQSGSRNGPQQKYVSGAQNAWDMFSDELSQKHITTGSGDLNDILGGGI 50

 51 HCKEVTEIGGVPGVGKTQLGIQLAINVQIPVECGGLGGKAVYIDTEGSFM 100
    |||||||||||||||||||||||||||||||||||||||||||  |||||
 51 HCKEVTEIGGVPGVGKTQLGIQLAINVQIPVECGGLGGKAVYI..EGSFM 98

101 VERVYQIAEGCIRDILEHFPHSHEKSSSVQKQLQPERFLADIYYFRICSY 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 99 VERVYQIAEGCIRDILEHFPHSHEKSSSVQKQLQPERFLADIYYFRICSY 148

151 TEQIAVINYMEKFLREHKDVRIVIIDSVTFHFRQDFEDLALRTRVLSGLS 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
149 TEQIAVINYMEKFLREHKDVRIVIIDSVTFHFRQDFEDLALRTRVLSGLS 198

201 LKLMKIAKTYNLAVVLLNQVTTKFTEGSFQLTLALGDSWSHSCTNRLILH 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
199 LKLMKIAKTYNLAVVLLNQVTTKFTEGSFQLTLALGDSWSHSCTNRLILH 248

251 WNGNERYAHLDKSPSLPVASAPYAVTGKGIRDAVSSNHKRARVT 294
    ||||||||||||||||||||||||||||||||
249 WNGNERYAHLDKSPSLPVASAPYAVTGKGIRDV 281
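The reported 99.644 percent identity can be reproduced from the alignment itself. The sketch below is illustrative only and rests on stated assumptions: the SEQ ID NO: 2 sequence is transcribed from this report, the SEQ ID NO: 4 row is rebuilt from it using the two differences visible in the alignment (the two-residue gap at positions 94-95 and the terminal V), and columns containing a gap in either row are excluded from the count. The names used are hypothetical.

```python
# SEQ ID NO: 2 (1107sid2), 294 residues, transcribed from the alignment.
SID2 = (
    "MGDQSGSRNGPQQKYVSGAQNAWDMFSDELSQKHITTGSGDLNDILGGGI"
    "HCKEVTEIGGVPGVGKTQLGIQLAINVQIPVECGGLGGKAVYIDTEGSFM"
    "VERVYQIAEGCIRDILEHFPHSHEKSSSVQKQLQPERFLADIYYFRICSY"
    "TEQIAVINYMEKFLREHKDVRIVIIDSVTFHFRQDFEDLALRTRVLSGLS"
    "LKLMKIAKTYNLAVVLLNQVTTKFTEGSFQLTLALGDSWSHSCTNRLILH"
    "WNGNERYAHLDKSPSLPVASAPYAVTGKGIRDAVSSNHKRARVT"
)

# SEQ ID NO: 4 (1107sid4) as aligned: a two-residue gap at columns 94-95
# and a final V where SEQ ID NO: 2 has A (column 283).
SID4_ALIGNED = SID2[:93] + ".." + SID2[95:282] + "V"


def percent_identity(row_a: str, row_b: str, gap: str = ".") -> float:
    """Identical columns as a percentage of columns with no gap in either row."""
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    print(round(percent_identity(SID2, SID4_ALIGNED), 3))
```

Under these assumptions 280 of 281 compared positions match, which rounds to the printed 99.644.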



Input Sequence: 1107sid2

!!AA_SEQUENCE 1.0

WPDEF Case 1107 SEQ ID NO: 2

Case 1107 SEQ ID NO: 2 from SEQ LISTING. Rad51-like sequences. Protein

1107sid2 Length: 294 August 28, 2001 16:05 Type: P
Check: 4041 ..

1 MGDQSGSRNG PQQKYVSGAQ NAWDMFSDEL SQKHITTGSG DLNDILGGGI

Input Sequence: 1107sid4

!!AA_SEQUENCE 1.0

WPDEF Case 1107 SEQ ID NO: 4

Case 1107 SEQ ID NO: 4 from SEQ LISTING. Rad51-like sequences. Protein

1107sid4 Length: 281 August 28, 2001 16:06 Type: P
Check: 3715 ..

1 MGDQSGSRNG PQQKYVSGAQ NAWDMFSDEL SQKHITTGSG DLNDILGGGI

http://eshpc02.es.dupont.com:8000/project/training_kkl/result/1107sid2_gap_41611.htm 02/25/2002



Gap Results 



GAP of: 1107sid2 check: 4041 from: 1 to: 294
WPDEF Case 1107 SEQ ID NO: 2

Case 1107 SEQ ID NO: 2 from SEQ LISTING. Rad51-like sequences. Protein

to: 1107sid6 check: 4041 from: 1 to: 294
WPDEF Case 1107 SEQ ID NO: 6

Case 1107 SEQ ID NO: 6 from SEQ LISTING. Rad51-like sequences. Protein.

Symbol comparison table: blosum62.cmp CompCheck: 6430

BLOSUM62 amino acid substitution matrix.

Reference: Henikoff, S. and Henikoff, J. G. (1992). Amino acid
substitution matrices from protein blocks. Proc. Natl. Acad.
Sci. USA 89: 10915-10919.

Gap Weight: 8 Average Match: 2.912
Length Weight: 2 Average Mismatch: -2.003

Quality: 1530 Length: 294
Ratio: 5.204 Gaps: 0
Percent Similarity: 100.000 Percent Identity: 100.000

Match display thresholds for the alignment(s):
| = IDENTITY
: = 2
. = 1

1107sid2 x 1107sid6 August 28, 2001 16:16



  1 MGDQSGSRNGPQQKYVSGAQNAWDMFSDELSQKHITTGSGDLNDILGGGI 50
    ||||||||||||||||||||||||||||||||||||||||||||||||||
  1 MGDQSGSRNGPQQKYVSGAQNAWDMFSDELSQKHITTGSGDLNDILGGGI 50

 51 HCKEVTEIGGVPGVGKTQLGIQLAINVQIPVECGGLGGKAVYIDTEGSFM 100
    ||||||||||||||||||||||||||||||||||||||||||||||||||
 51 HCKEVTEIGGVPGVGKTQLGIQLAINVQIPVECGGLGGKAVYIDTEGSFM 100

101 VERVYQIAEGCIRDILEHFPHSHEKSSSVQKQLQPERFLADIYYFRICSY 150
    ||||||||||||||||||||||||||||||||||||||||||||||||||
101 VERVYQIAEGCIRDILEHFPHSHEKSSSVQKQLQPERFLADIYYFRICSY 150

151 TEQIAVINYMEKFLREHKDVRIVIIDSVTFHFRQDFEDLALRTRVLSGLS 200
    ||||||||||||||||||||||||||||||||||||||||||||||||||
151 TEQIAVINYMEKFLREHKDVRIVIIDSVTFHFRQDFEDLALRTRVLSGLS 200

201 LKLMKIAKTYNLAVVLLNQVTTKFTEGSFQLTLALGDSWSHSCTNRLILH 250
    ||||||||||||||||||||||||||||||||||||||||||||||||||
201 LKLMKIAKTYNLAVVLLNQVTTKFTEGSFQLTLALGDSWSHSCTNRLILH 250

251 WNGNERYAHLDKSPSLPVASAPYAVTGKGIRDAVSSNHKRARVT 294
    ||||||||||||||||||||||||||||||||||||||||||||
251 WNGNERYAHLDKSPSLPVASAPYAVTGKGIRDAVSSNHKRARVT 294



Input Sequence: 1107sid2 

http://eshpc02.es.dupont.com:8000/project/training_kkl/result/1107sid2_gap_42536.htm
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! !AA_SEQUENCE 1.0 — _ 

WPDEF Case 1107 SEQ ID NO: 2

Case 1107 SEQ ID NO: 2 from SEQ LISTING. Rad51-like sequences. Protein

1107sid2 Length: 294 August 28, 2001 16:05 Type: P
Check: 4041 ..

1 MGDQSGSRNG PQQKYVSGAQ NAWDMFSDEL SQKHITTGSG DLNDILGGGI



Input Sequence: 1107sid6

!!AA_SEQUENCE 1.0

WPDEF Case 1107 SEQ ID NO: 6

Case 1107 SEQ ID NO: 6 from SEQ LISTING. Rad51-like sequences. Protein.

1107sid6 Length: 294 August 28, 2001 16:07 Type: P
Check: 4041 ..

1 MGDQSGSRNG PQQKYVSGAQ NAWDMFSDEL SQKHITTGSG DLNDILGGGI
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Multiple Sequence Alignment Res 

!!AA_MULTIPLE_ALIGNMENT 1.0

Multiple Sequence Alignment Results

Symbol comparison table: blosum62.cmp CompCheck: 6430

GapWeight: 8
GapLengthWeight: 2

1107sid2_pileup_42538.txt MSF: 294 Type: P August 28, 2001 16:10 Check: 7808 ..

Name: 1107sid2 Len: 294 Check: 4041 Weight: 1.00
Name: 1107sid6 Len: 294 Check: 4041 Weight: 1.00
Name: 1107sid4 Len: 294 Check: 9726 Weight: 1.00

//

          1                                                   50
1107sid2  MGDQSGSRNG PQQKYVSGAQ NAWDMFSDEL SQKHITTGSG DLNDILGGGI
1107sid6  MGDQSGSRNG PQQKYVSGAQ NAWDMFSDEL SQKHITTGSG DLNDILGGGI
1107sid4  MGDQSGSRNG PQQKYVSGAQ NAWDMFSDEL SQKHITTGSG DLNDILGGGI

          51                                                 100
1107sid2  HCKEVTEIGG VPGVGKTQLG IQLAINVQIP VECGGLGGKA VYIDTEGSFM
1107sid6  HCKEVTEIGG VPGVGKTQLG IQLAINVQIP VECGGLGGKA VYIDTEGSFM
1107sid4  HCKEVTEIGG VPGVGKTQLG IQLAINVQIP VECGGLGGKA VYI..EGSFM

          101                                                150
1107sid2  VERVYQIAEG CIRDILEHFP HSHEKSSSVQ KQLQPERFLA DIYYFRICSY
1107sid6  VERVYQIAEG CIRDILEHFP HSHEKSSSVQ KQLQPERFLA DIYYFRICSY
1107sid4  VERVYQIAEG CIRDILEHFP HSHEKSSSVQ KQLQPERFLA DIYYFRICSY

          151                                                200
1107sid2  TEQIAVINYM EKFLREHKDV RIVIIDSVTF HFRQDFEDLA LRTRVLSGLS
1107sid6  TEQIAVINYM EKFLREHKDV RIVIIDSVTF HFRQDFEDLA LRTRVLSGLS
1107sid4  TEQIAVINYM EKFLREHKDV RIVIIDSVTF HFRQDFEDLA LRTRVLSGLS

          201                                                250
1107sid2  LKLMKIAKTY NLAVVLLNQV TTKFTEGSFQ LTLALGDSWS HSCTNRLILH
1107sid6  LKLMKIAKTY NLAVVLLNQV TTKFTEGSFQ LTLALGDSWS HSCTNRLILH
1107sid4  LKLMKIAKTY NLAVVLLNQV TTKFTEGSFQ LTLALGDSWS HSCTNRLILH



251 

K !!^ NERYAHL DKSPSLPVAS APYAVTGKGI" RDAVSSNHKR ARv T 

88S &SS SS5 55S »ll 



7808 



http://eshpc02.es.dupont.co m :8000/project/trainin g _kkl^ 
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Multiple Sequence Alignment Resu 



Multiple Sequence Alignment Dendrogram, August 28, 2001 (tree image not recoverable)
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Cl 



o 
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o 
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APPENDIX B 



Gap Results

GAP of: 1107sid1 check: 4817 from: 1 to: 1474
WPDEF Case 1107 SEQ ID NO: 1

Case 1107 Rad51-like sequences. From SEQ LISTING.

to: ac002387cds check: 3310 from: 1 to: 999

WPDEF Case 1107 At Rad51
AC002387 chromosome 2.
Locus AAB82635
GI 2583126

Symbol comparison table: nwsgapdna.cmp CompCheck: 8760

Gap Weight: 50 Average Match: 10.000
Length Weight: 3 Average Mismatch: 0.000

Quality: 5743 Length: 1475
Ratio: 5.749 Gaps: 4
Percent Similarity: 60.120 Percent Identity: 60.120

Match display thresholds for the alignment(s):
| = IDENTITY
: = 5
. = 1

1107sid1 x ac002387cds August 30, 2001 11:35



151 cggctgcgtggcgccaccgacggaggctacgagcggttgtggaggcagat 200 
1 III 111(11 

atgatttcatttgggcggcgta 22 

201 atgagaggtggaggtggctacaacgggtcggcggctgtgagatactgaaa 250 
1 1 I I I i III I I I I I I 

23 aatcgccggcgattgaagaaacttcactcgcgacttcagtcatggaggca 72 

251 tccgcactgcagttctcttcttcccccaatcagtaccacctctccaagtg 300 
73 tggaggttaccgttatcgccttcgatta gaggaaaact 110 

301 gcaatcaccatgggagatcaatctggc.tctagaaatggaccacaacaga 349 

1 I I I I MM I II I | 

111 gatatcggccggttatacttgtctgtcttcgattgcttccgtctcttctt 160 

350 agtacgtttcaggagcccagaatgcctgggatatgttctctgatgagctg 399 

1fil ' ' 1 1111 MM II | in | || I,, J 

161 ctgatctcgctcgagcaaagaacgcttgggatatgcttcacgaggaggag 210 

400 tcacagaaacacatcactactggttctggtgacctcaatgacatacttgg 449 
211 i' ' 1 1111111 'I Ml II II MM | m 
211 tctttgccgcgtattactacatcttgctctgatcttgataacattttggg 260 

450 tggcgggattcactgcaaagaagttactgagatcggtggcgtcccagggg 499 

9fi , "J 1 III III I II Mill IMM MM, || Mini 

261 cggtggaattagctgtagggatgttacagagattggtggggtaccaggga 310 

http://eshpc02.es.dupont.com:8000/project/training_kkl/result/1107 S idl_gap_4^ 



02/25/2002 




500 ttggtaaaactcaactgg^ttcaactagcaatcaatgtacaaatccca ft 
,n /!' ' " 1 1 11 1 1 I I I I I I I | | | I I I M I M I I | | | 
311 "ggcaagactcagattgggatccagctctctgtgaatgttcagattcca 360 

550 gtggaatgtggtggccttggtgggaaagcagtttatatagatacagaggg 599 

... N Mil I IN inn ,|| | | m I Mill MIIMM II 

361 cgtgagtgtggtggtcttggagggaaagctatatatatcgatacagaagg 410 

600 cagtttcatggttgaacgtgtctaccaga^tgctgaagggtgtattaggg 649 
11 MIIMM II MM I I I I I | || | | | | | | | I I, 

411 tagcttcatggtggagcgtgctttacagatagcagaagcttgtgtagagg 460 

650 ^atactggagcactttccgcacagccatgagaagtcctcttctgtccaa 699 
a en L '" IN I M I || | Mi 

461 acatggaagaatacacaggatacatgcataaacattttcaagcaaatcaa 510 

700 ^acaattacagcctgagcgtttcctggcggatatctattacttccggat 749 
mi M I I N II I II I I I I II I MMMM I 
511 gtacaaatgaaaccagaagatatcttagagaacatattctacttccgtgt 560 

750 a tgcagttacaccgaacaaattgcagtcataaactacatggagaagttcc 799 

.„ I N I I I I I | | M I I Mill III I I || I I I I | | MM 

561 ctgcagttacaccgagcaaatcgcattggtcaatcatcttgaaaag^ca 610 

800 tcagagagcataaagatgtgcgtatagttattattgatagtgttacttt^ 849 

fi11 IL J' ' " IIM " MM! I || III | || || 

611 tctctgaaaacaaagatgt...agttgtaatcgtagacagtatcaccttt 657 






850 ^tttcgacaagatttt^ 899 
tcaggactatgatga. 
cattgaagttaatga, 

I I I I M II M II I I II II | | | | iii 



658 catttccgtcaggac^atgatgacttagcccagaggacacgagtgitcag 707 

900 tggattatcattgaagttaatgaagattg^aaagacatataacttggcag 949 
7n9 ' 1 I I I M I II II I I II II || | , I,, ? 

708 cgaaatggctttaaagttcatgaagcttgccaaaaagttctcacttgcgg 757 

950 ttgtcttgttgaaccaagtcactactaaatttacagaagggtcatttca; 999 
7qp I ' ' ' 1 1 I 1 I I I I I I I I I II M M I || | | || II M M 
758 tcgtgttactaaaccaggtgaccacaaagtttagtgaaggc^gtttiaa 807 

1000 ttgact,cttgctcta g gtgac a gct gg tc;cactcatgcacgaaccggtt 1049 

snp ' 1 1 1 1 I 1 1 I I N II II II I I II || || Ml || | 

808 ctagcgcttgctttaggcgatagctggtctcattcgtgcaccaaicgagt 857 

1050 gattctgcactggaatgggaacgaacgatacgcacatct^gataagtct; 1099 
1,1,11 I I I I II I M II MM!! I I i i I i i i i i i , 

858 cattctgtattggaatggtgatgagigtiaigiatatatcgaiaagtici 907 

noo Mr gt t?rir?r c r 9 M tg r a n g rrT? a ?r a r 1149 

908 cttcacttccttcagcttcggcttcatacactgtaaccagtagaggtcta 957 

1150 ^gagatgctgtgagctcaaaccacaagcgagcccgagtaacgtagcattc 1199 
o ' M ' I ''I I M II I II I I Mill 

ys« agaaactc. • ■ atcctcgagtagcaagcgagtcaagatgatgt; 



:aa 999 



Input Sequence: 1107sidl 

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! ! NA^SEQUENCE 1.0 — "~ 

WPDEF Case 1107 SEQ ID NO: 1

Case 1107 Rad51-like sequences. From SEQ LISTING.
1107sid1 Length: 1474 August 28, 2001 15:55 Type: N
Check: 4817 ..

1 tcgacccacg cgtccgcact tgactcccag tctcccactg tgcgcagttc

51 gcttggtccc cggagcccca aaggcggcgg tgagccggag



Input Sequence: ac002387cds

!!NA_SEQUENCE 1.0

WPDEF Case 1107 At Rad51
AC002387 chromosome 2.
Locus AAB82635
GI 2583126

ac002387cds Length: 999 August 30, 2001 11:30 Type: N
Check: 3310 ..

1 atgatttcat ttgggcggcg taaatcgccg gcgattgaag aaacttcact



FrameAlign Results

Local alignment of: 1107sid1 check: 4817 from: 1 to: 1474
WPDEF Case 1107 SEQ ID NO: 1

Case 1107 Rad51-like sequences. From SEQ LISTING.

to: 1107ac002387pep check: 9453 from: 1 to: 332

WPDEF Case 1107 At Rad51 protein
AC002387 protein
Locus AAB82635
GI 2583126
Case 1107 Rad51

Scoring matrix: blosum62.cmp
CompCheck: 1102
BLOSUM62 amino acid substitution matrix.

Reference: Henikoff, S. and Henikoff, J. G. (1992). Amino acid
substitution matrices from protein blocks. Proc. Natl. Acad.
Sci. USA 89: 10915-10919.
Translation table: transl table 01.txt

translatable = 1

This file contains the Standard Code specified in the feature table
definition, Version 1.08, formatted for use with the GCG programs
(Data Files volume of the Data Reference Set). It names amino acids in
both one and three-letter form and lists the codons which should
translate into them. All GCG translation programs may generate their .

Gap Weight: 8 Average Match: 2.778
Length Weight: 2 Average Mismatch: -2.248
Frameshift Weight: 0

Quality: 982 Length: 813
Ratio: 3.637 Gaps: 1
Percent Similarity: 78.519 Percent Identity: 67.037

Match display thresholds for the alignment(s):
| = IDENTITY
: = 2
. = 1

1107sid1 x 1107ac002387pep February 25, 2002 17:01
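The FrameAlign summary figures can be cross-checked against the coordinates printed in the alignment that follows (nucleotides 364 to 1176 of SEQ ID NO: 1 against amino acids 59 to 328 of the Arabidopsis protein). The sketch below is an illustration, not part of the printout. It assumes that Ratio is Quality per aligned amino acid and that the percentages are taken over the aligned amino acids; both assumptions are inferred from the numbers rather than stated in the report, and the function names are hypothetical.

```python
def span_length(start: int, end: int) -> int:
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


NUCLEOTIDE_SPAN = span_length(364, 1176)   # aligned region of SEQ ID NO: 1
PROTEIN_SPAN = span_length(59, 328)        # aligned region of the protein

QUALITY = 982
PERCENT_IDENTITY = 67.037
PERCENT_SIMILARITY = 78.519


def implied_count(percent: float, total: int) -> int:
    """Whole number of positions implied by a printed percentage."""
    return round(percent / 100.0 * total)


if __name__ == "__main__":
    print("nucleotide span:", NUCLEOTIDE_SPAN)
    print("protein span:", PROTEIN_SPAN)
    print("quality per residue:", round(QUALITY / PROTEIN_SPAN, 3))
    print("identical residues:", implied_count(PERCENT_IDENTITY, PROTEIN_SPAN))
    print("similar residues:", implied_count(PERCENT_SIMILARITY, PROTEIN_SPAN))
```

Under these assumptions the nucleotide span equals the reported Length of 813, the quality per residue equals the reported Ratio of 3.637, and the percentages correspond to 181 identical and 212 similar residues out of 270.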



364 gcccagaatgcctgggatatgttctctgatgagctgtcacagaaacacat 413 
M ••-IIIIMIIIIIM 1 :::||| III || 

59 AlaLysAsnAlaTrpAspMetLeuHisGluGluGluSerLeuProArgll 75 

414 cactactggttctggtgacctcaatgacatacttggtggcgggattcact 463 

„ I Ml IN MUM IMIMIIIIIIIIIIII I 

76 eThrThrSerCysSerAspLeuAspAsnlleLeuGlyGlyGlylleSerC 92 
464 gcaaagaagttactgagatcggtggcgtcccaggggttggtaaaactcaa 513 

M::::::|IIIII|||||||MIIIMIIII|||:::||||||MIIII 
93 ysArgAspValThrGluIleGlyGlyValProGlylleGlyLysThrGln 108 

514 ctggggattcaactagcaatcaatgtacaaatcccagtggaatqtqqtqq 563 

, nQ •-IMMMIIIII...::: |||||||||| ||||||||||| 

109 HeGlylleGlnLeuSerValAsnValGlnlleProArgGluCysGlyGl 125 



http://eshpc02.es.dupont.com:8000/project/training_kkl/result/1107ac002387_framea_35386.htm 02/25/2002

564 ccttggtgggaaagcagtftatagatacagagggcagtttcatggttg H 

1 1 I II I I I I I I I I I I ::: I M I I I I I I I I I I I I I I I I I I I I I I 

126 YLeuGlyGlyLysAlalleTyrlieAspi^ 142 

614 itMr tcta iiftnfrnirifr tta99 f r act9 nr ac 663 

143 luArgAlaLeuGlnlleAlaGluAlaCysValGluAspMetGluGluTyr 158 
664 tttccgcacagccatgagaagtcctcttctgtccaaaaacaattacagcc 713 
159 ThrGlyTyrMetHisLysHisPheGlnAlaAsnGlivalGiiiMetLysii 175 
714 tgagcgtttcctggcggatatctattacttccggatatg^agttacaccg 763 

193 luGlnlleAlaLeuValAsnHisLeuGluLyiiielleSerGiiAsniy] 208 

814 gatgtgcgtatagttattattgatagtgttactttccactttcgacaaga 863 
™ 111111 : : H I I I I I I I I I I I I I I I I I I I I I I I 

209 AspVal...valVallleValAspsI±llX^^ 224 

864 "ttgaagatctggcactgaggaccagagtgctaagtggattatcattga 913 
„, MIMMIIIIIIIIMI INI 

225 PTyrAspAspLeuAlaGlnArgThrArgValLeuSerGluMetAlaieuL 241 

914 agttaatgaagattgcaaagacatataacttggcagttgtcttgttgaac 963 
o.o " . I I I I I h = :| I I | | | :: = ---l|||||||||lllllllllll 
242 ysPheMetLysLeuAlaLysLysPheSerieiAlaiaiiaiiliieiAin 257 

964 «agtcactactaaatttacagaagggtcatttcaattgactcttgctct 1013 

Mllllllllll ... I I I | I I I I I I I II I I I I llllllll 

258 MnValThrThrLysPheSe^^^^ 274 

1014 ^gtgacagctggtcccactcatgcacgaaccgg 1063 
,, c M nillllllllllM MM llll|...||||||:- -1111 

275 uGlyAspSerTrpSerHisSerCysTirAsiArgVainiiiiTyrilpl 291 

1064 ^gggaacgaacgatacgcacatcttgataagtctccttcacttccagta 1113 

1114 rnnifii^in 9 "!??^""??!!! 1 "?!? 9 " 9 " 9 ' 3 ?? 1163 

308 ««SarMaSerTyrThrValThrSerarg01yI. e uAr9A sn serSersi 324 

1164 ctcaaaccacaag 1176 
INI... ::: 
325 rSerSerLysArg 32 8 



Input Sequence: 1107sidl 




!!NA_SEQUENCE 1.0
WPDEF Case 1107 SEQ ID NO: 1
Case 1107 Rad51-like sequences. From SEQ LISTING.
1107sid1 Length: 1474 August 28, 2001 15:55 Type: N
Check: 4817 ..

1 tcgacccacg cgtccgcact tgactcccag tctcccactg tgcgcagttc

51 gcttggtccc cggagcccca aaggcggcgg tgagccggag



Input Sequence: 1107ac002387pep 



!!AA_SEQUENCE 1.0

WPDEF Case 1107 At Rad51 protein
AC002387 protein
Locus AAB82635
GI 2583126
Case 1107 Rad51

ac002387pep Length: 332 August 30, 2001 11:32 Type: P
Check: 9453 ..

1 MISFGRRKSP AIEETSLATS VMEAWRLPLS PSIRGKLISA



APPENDIX C 



PDB Query Result

PROTEIN DATA BANK Query Result Browser

Your query found 45 structures in the current PDB release and you have selected 6 structures so far. Only the
selected structures are currently shown. To examine an individual structure select the Explore link.



1EW1 Deposited: 21-Apr-2000 Exp. Method: NMR, 10 Structures

Title Reca Protein-Bound Single-Stranded DNA
Classification Deoxyribonucleic Acid
Compound Mol_Id: 1; Molecule: DNA (5'-D(TpApCpG)-3'); Chain: A; Engineered: Yes; Other_Details: Reca Protein-Bound Single-Stranded DNA

1G18 Deposited: 11-Oct-2000 Exp. Method: X-ray Diffraction Resolution: 3.80 Å

Title Reca-ADP-Alf4 Complex
Classification Hydrolase
Compound Mol_Id: 1; Molecule: Reca Protein; Chain: A; Synonym: Recombination Protein Reca; Ec: 3.4.99.37

1G19 Deposited: 11-Oct-2000 Exp. Method: X-ray Diffraction Resolution: 3.00 Å

Title Structure Of Reca Protein
Classification Hydrolase
Compound Mol_Id: 1; Molecule: Reca Protein; Chain: A; Synonym: Recombination Protein Reca; Ec: 3.4.99.37

1REA Deposited: 19-Dec-1991 Exp. Method: X-ray Diffraction Resolution: 2.70 Å

Title Structure Of The Reca Protein-ADP Complex
Classification DNA Binding Protein
Compound Reca Protein (E.C. 3.4.99.37) Complex With Adenosine Diphosphate (Reca-ADP)

2REB Deposited: 06-Mar-1992 Exp. Method: X-ray Diffraction Resolution: 2.30 Å

Title The Structure Of The E. Coli Reca Protein Monomer and Polymer
Classification DNA Binding Protein
Compound Reca Protein (E.C. 3.4.99.37)



© RCSB 



http://ww.rcsb.org/pdb/cgi/resu^ 02/28/2002 



Proc. Natl. Acad. Sci. USA 

Vol 94, pp. 6623-6628, June 1997 

Biochemistry 

An extended DNA structure through deoxyribose-base stacking 
induced by RecA protein 

(homologous genetic recombination/NMR/NMR spectroscopy/transferred nuclear Overhauser effect) 
TARO NISHINAKA*†, YUTAKA ITO*, SHIGEYUKI YOKOYAMA†‡, AND TAKEHIKO SHIBATA*§

*Cellular and Molecular Biology Laboratory, and †Cellular Signaling Laboratory, The Institute of Physical and Chemical Research (RIKEN), Saitama 351-01,
Japan; and ‡Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, Tokyo 113, Japan

Communicated by Charles R. Cantor, Boston University, Boston, MA, April 14, 1997 (received for review December 2, 1996)



ABSTRACT The family of proteins that are homologous
to RecA protein of Escherichia coli is essential to homologous
genetic recombination in various organisms including viruses,
bacteria, lower eukaryotes, and mammals. In the presence of
ATP (or ATPγS), these proteins form helical filaments containing
single-stranded DNA at the center. The single-stranded
DNA bound to RecA protein is extended 1.5 times
relative to B-form DNA with the same sequence, and the
extension is critical to pairing with homologous double-stranded
DNA. This pairing reaction, called homologous
pairing, is a key reaction in homologous recombination. In this
NMR study, we determined a three-dimensional structure of
the single-stranded DNA bound to RecA protein. The DNA
structure contains novel deoxyribose-base stacking in which
the 2'-methylene moiety of each deoxyribose is placed above
the base of the following residue, instead of normal stacking
of adjacent bases. As a result of this deoxyribose-base stacking,
bases of the single-stranded DNA are spaced out nearly 5 Å.
Thus, this novel structure well explains the axial extension
of DNA in the RecA-filaments relative to B-form DNA and
leads to a possible interpretation of the role of this extension
in homologous pairing.
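The two figures in the abstract, a 1.5-fold extension and a base spacing of nearly 5 Å, agree with each other if the usual axial rise of B-form DNA is taken as about 3.4 Å per base. That value is a textbook figure assumed here; it is not stated in the abstract. A minimal sketch of the arithmetic, with illustrative names:

```python
# Axial rise per base in B-form DNA, in angstroms.
# Assumed textbook value; not given in the abstract.
B_FORM_RISE_ANGSTROM = 3.4

# Extension of RecA-bound DNA relative to B-form DNA, from the abstract.
EXTENSION_FACTOR = 1.5


def extended_rise(rise: float = B_FORM_RISE_ANGSTROM,
                  factor: float = EXTENSION_FACTOR) -> float:
    """Expected base-to-base spacing after axial extension, in angstroms."""
    return rise * factor


if __name__ == "__main__":
    print(f"expected spacing: {extended_rise():.1f} angstroms")
```

The product is about 5.1 Å, in line with the spacing of nearly 5 Å reported for the bound single-stranded DNA.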



Homologous genetic recombination plays critical roles in both 
evolution and maintenance of a functional genome. RecA 
protein is essential to homologous recombination in Esche- 
richia coli (1, 2), and promotes ATP-dependent joint-molecule 
formation from homologous double-stranded DNA and sin- 
gle-stranded DNA through "homologous pairing" in vitro (3, 
4). Homologous pairing by RecA protein has been extensively 
studied for more than a decade. How single-stranded DNA 
recognizes sequence homology in double-stranded DNA has 
been a central question in these studies. Based on studies using 
chemical probing, electron microscopy, modification of base 
sequences, mutant RecA proteins, and others, various models 
such as triplex formation have been proposed to explain the 
mechanism of recognition of homology (see refs. 5-12 for 
reviews). However, little information is available on the three- 
dimensional structures of DNA during homologous pairing, 
information that is essential for a clear view of the mechanism 
of homologous recognition. 

At the first stage of homologous pairing, RecA protein binds 
to single-stranded DNA in the presence of ATP, and then 
double-stranded DNA binds to the nucleoprotein complex for 
searching for homology (13, 14). Electron microscopic studies 
revealed that RecA protein forms helical filaments on the 
single-stranded DNA. Biochemical studies showed that such 
filaments formed in the presence of ATP ("presynaptic fila- 
ment") are molecular machines for homologous pairing of the 
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single-stranded DNA in the filaments with naked double- 
stranded DNA that is then taken up into the filament (14, 15). 
Under certain conditions, RecA protein forms a filament on 
double-stranded DNA, whose shape is very similar to that of 
the filament formed on single-stranded DNA (14, 16, 17). In 
these RecA filaments, both single-stranded and double- 
stranded DNA are extended 1.5 times as compared with 
B-form DNA. In spite of low degrees of amino acid sequence 
homology, eukaryotic homologs of RecA protein, the Rad51 
proteins from Saccharomyces cerevisiae and Homo sapiens, and 
the functional homolog UvsX protein from coli-phage T4 form 
helical nucleoprotein filaments that have a shape that is nearly 
identical to bacterial RecA protein, as revealed by electron 
microscopy (18-20). In the experiments described here, we 
determined a three-dimensional structure of single-stranded 
DNA bound to RecA protein, which revealed a novel stacking 
of deoxyribose and bases. 

MATERIALS AND METHODS 

Oligodeoxyribonucleotides and RecA Protein. Oligonucleotides
were synthesized on a DNA synthesizer (EXPEDITE; Millipore)
and purified with reversed-phase column cartridges (Oligo-pak SP;
Millipore), or purchased from Cruachem (Kyoto) or Genset (Tokyo).
Undesirable organic impurities and metal ions were removed by
using cation-exchange resins (AG 50W-X8, Chelex 100; Bio-Rad).
The purified oligonucleotides were lyophilized rapidly
and stored at -20°C. DNA concentrations were determined by 
absorbance measurements at 260 nm and are expressed in 
moles of entire molecules rather than moles of nucleotide 
residues. 

RecA protein was purified as described by Shibata et al. (21,
22), with a minor modification, and dialyzed against 20 mM
Tris·Cl (pH 7.5) buffer containing 6.7 mM MgCl2 and 150 mM
NaCl. By ultrafiltration, we concentrated RecA protein
and replaced the solvent with a deuterium buffer {20 mM
[uniform 2H] Tris·Cl, pH 7.1 (pH values were uncorrected for
isotope effects)/6.7 mM MgCl2/150 mM NaCl}; i.e., the
protein solution was centrifuged at 3,000 rpm for 0.5-2 hr at
4°C in a Centriprep cartridge (30-kDa cut-off; Amicon),
followed by dilution with the deuterium buffer. We repeated
this process several times. The sample was then lyophilized,
stored at -20°C, and redissolved in D2O (99.96%; Euriso-top)
before use. The activity of the RecA protein preparation was
assessed by assaying the single-stranded DNA-dependent ATPase
activity, which was not changed by lyophilization.

Just after purification, the concentrations of RecA
protein were first determined by the Folin phenol-reagent
method described by Lowry et al. (23), with bovine serum
albumin as a standard. After preparation for NMR spectroscopic
observations, as just described, the concentrations were
determined again by the Bradford method (Bio-Rad), with
untreated RecA protein as the standard. The amounts of RecA
protein are expressed as moles of 38-kDa polypeptide.

NMR Spectroscopy. One- and two-dimensional spectra
were measured on a Bruker AMX600 spectrometer at 20-37°C
in 20 mM [uniform 2H] Tris·Cl buffer (pH 7.0) containing
6.7 mM MgCl2 and 150 mM NaCl in D2O.

For one-dimensional nuclear Overhauser effect (NOE)
difference spectra, an objective proton was irradiated for 0.5
sec prior to a 90° read pulse, and the free induction decays were
subtracted from those of off-resonance irradiated scans. The water
signal was presaturated for 1.5 sec. The spectra were recorded with
a spectral width of 6,024 Hz and 16 k data points. Total measuring
time was 29 min.

To attenuate undesirable protein resonances in the transferred
NOE spectroscopy (NOESY) spectrum, we applied a
short T1ρ filter (20 ms, γB1 = 3 kHz) before a standard NOESY
pulse scheme (pre-sat. - 90°x - SLy - 90°-x - g - 90° - t1 -
90° - τm - g - 90° - Acq.; g, a gradient pulse; ref. 24). The
mixing time was varied randomly over 8% of the designated
mixing time to suppress zero-quantum artifacts in transferred
NOESY spectra (25). The spectra were acquired with a total
of 1,024 (t2) × 400 (t1) complex points and a spectral width of
6,024 Hz. The free induction decays were apodized with a
90°-shifted skewed sinebell function (skew parameter 2) before
Fourier transformation in both dimensions. All two-dimensional
data sets were processed and analyzed with the
program FELIX, version 2.3 (Biosym Technologies, San Diego)
using Silicon Graphics workstations.

Analysis of NMR Data and Structural Calculation. NOE
intensities in two-dimensional spectra at mixing times of 100,
120, 180, or 200 msec were integrated and calibrated using
those of intraresidue C[H5]-C[H6] crosspeaks as references,
except in the case of d(TAG). For d(TAG), the intensity of the
H2'-H1' crosspeak was used as a reference. The interproton
distance of H2'-H1' holds a nearly constant value between
C2'-endo (2.99 Å) and C3'-endo (2.73 Å). All intensities were
converted to distance restraints using the relation
$I/I_0 = (r_{ij}/r_0)^{-6}$, where $I$ = peak intensity, $r_{ij}$ =
distance between protons $i$ and $j$, $I_0$ = peak intensity of
the reference, and $r_0$ = distance of the reference. All
restraints were classified as short (r < 3.5 Å), medium
(3.5 Å < r < 4.5 Å), or long (4.5 Å < r), with the upper and
lower bounds set to ±0.3 Å, ±0.4 Å, and ±0.5 Å,
respectively. Additional distance restraints that appeared at
longer mixing time (< 200 msec) and repulsive distance
restraints were incorporated during the structural refinement.
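
The calibration described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function and class names are hypothetical, the reference distance of 2.45 Å for cytosine H5-H6 is an assumed typical value that the text does not state, and the treatment of distances falling exactly on 3.5 Å or 4.5 Å (which the text leaves open) is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Restraint:
    distance: float  # estimated interproton distance (angstroms)
    lower: float     # lower bound (angstroms)
    upper: float     # upper bound (angstroms)
    category: str    # "short", "medium", or "long"


def noe_to_distance(intensity, ref_intensity, ref_distance):
    """Invert I/I0 = (r/r0)**-6 to get r = r0 * (I/I0)**(-1/6)."""
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (intensity / ref_intensity) ** (-1.0 / 6.0)


def classify(distance):
    """Attach the bounds quoted in the text to a distance estimate.

    Assumption: boundary values (exactly 3.5 or 4.5 angstroms) are
    assigned to the longer class.
    """
    if distance < 3.5:
        tolerance, category = 0.3, "short"
    elif distance < 4.5:
        tolerance, category = 0.4, "medium"
    else:
        tolerance, category = 0.5, "long"
    return Restraint(distance, distance - tolerance, distance + tolerance, category)


if __name__ == "__main__":
    # Assumed reference: cytosine H5-H6 at 2.45 angstroms (not from the text).
    r = noe_to_distance(intensity=0.1, ref_intensity=1.0, ref_distance=2.45)
    print(classify(r))
```

Because of the sixth-power dependence, a large error in a measured intensity translates into a much smaller error in the distance, which is one reason for working with fairly wide bounds rather than exact distances.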

All structural calculations were carried out by the use of
X-PLOR, version 3.1 (26). The refinement protocol followed
essentially the procedures outlined in the X-PLOR manual.
After a short cycle of energy minimization, simulated annealing
calculations were initiated at 1,000 K and run for 18 ps. The
temperature was then lowered to 100 K in 25 K steps, with
1.5 ps of dynamics during each cooling step. The structure was
then minimized for 200 cycles. Because the sugar puckering
was ill determined, the obtained structure was further minimized
by use of the revised parameters published by Parkinson
et al. (27).
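
To make the length of the annealing protocol concrete, the short Python sketch below enumerates the cooling schedule implied by the numbers above. It assumes that each 25 K decrement counts as one cooling step (975 K, 950 K, ..., 100 K); the text does not say whether a further 1.5 ps segment is run at 1,000 K, and the function names are hypothetical.

```python
def cooling_temperatures(start_k=1000, end_k=100, step_k=25):
    """Temperatures (K) visited during cooling, one per cooling step.

    Assumption: the first cooling step runs at start_k - step_k and the
    last at end_k.
    """
    if step_k <= 0 or start_k <= end_k or (start_k - end_k) % step_k != 0:
        raise ValueError("schedule must divide evenly into positive steps")
    return list(range(start_k - step_k, end_k - 1, -step_k))


def total_dynamics_ps(initial_ps=18.0, per_step_ps=1.5, **schedule):
    """Total simulated time: high-temperature phase plus all cooling steps."""
    return initial_ps + per_step_ps * len(cooling_temperatures(**schedule))


if __name__ == "__main__":
    temps = cooling_temperatures()
    print(len(temps), "cooling steps;", total_dynamics_ps(), "ps of dynamics in total")
```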

Assignments for all nonexchangeable protons of DNA (except
stereospecific assignments of H5' and H5") were obtained
by analysis of double quantum filtered correlated spectroscopy
(DQF-COSY), total correlated spectroscopy (TOCSY),
rotating-frame Overhauser effect spectroscopy (ROESY), and
1H-31P correlated spectroscopy (1H-31P COSY) spectra using
standard sequential assignment techniques (28). Stereospecific
assignments of H5' and H5" were carried out during the
process of structural refinement.

RESULTS 

Transferred NOE Analysis of Single-Stranded Oligodeoxyribonucleotides
Bound to RecA Protein. The structure of
single-stranded DNA induced by binding to RecA protein
in the presence of ATPγS was analyzed by means of the
transferred NOE (TRNOE) using short (3-6-mer)
oligodeoxyribonucleotides: d(TAG), d(CGA), d(TACG), and
d(TGACAT). The TRNOE allows us to analyze structures of small
ligands bound to large molecules when the exchange between
bound and free states is fast enough (29-33).

We added RecA protein stepwise to the solution containing
oligodeoxyribonucleotides and ATPγS (an unhydrolyzable
ATP analog). The chemical shifts of the resonances moved
slightly and the signals were slightly broadened in
one-dimensional 1H-NMR spectra after the addition of RecA
protein. We observed no resonances derived from the bound
state, which should have appeared if the exchange rate were
slow. Signals from RecA protein were hardly detected due to
severe signal broadening. From T1ρ measurements as a function
of the spin-lock field strength, we have determined the
dissociation rate constants for the oligodeoxyribonucleotide-RecA
complex. The value for d(TGACAT) at 30°C was 40,000
(±4,000) s⁻¹, which would be fast enough compared with the
chemical shift scale and the cross-relaxation rate (T.N. and
Y.L, unpublished observation).

Fig. 1. One-dimensional TRNOE difference spectra of an
oligodeoxyribonucleotide. (a and b) Spectra of 1.1 mM d(TGACAT) and
54 μM RecA protein in the presence of 1.1 mM ATPγS (a) or ADP
(b) in D2O at 37°C. (c) Spectrum of the DNA solution before the
addition of RecA protein and ATPγS or ADP. (d) DNA was replaced
by RNA; 1.1 mM r(UGACAU) in the presence of RecA protein and
ATPγS. The cytosine H5 proton of DNA or RNA was irradiated for
0.5 sec before a 90° read pulse.

ATP is an essential cofactor for RecA protein-mediated
homologous pairing. ATP is hydrolyzed by RecA protein
during the reaction, and hydrolysis of ATP decreases the
affinity of RecA protein for DNA. When ATP is replaced by
ATPγS, RecA protein promotes homologous pairing of single-stranded
and double-stranded DNA molecules equally well
(34), and presynaptic filaments formed in the presence of
ATPγS under optimum conditions for homologous pairing
resemble those formed in the presence of ATP (35). We found
that RecA protein induced TRNOEs of the above
oligodeoxyribonucleotides in the presence of ATPγS (Fig. 1a), and
that the crosspeaks of the transferred NOESY spectra were
intense and well resolved (see Fig. 2). Intermolecular crosspeaks
between RecA protein and oligodeoxyribonucleotides
were not observed, probably because of the severely broadened
signals of RecA protein. On the other hand, in the absence of RecA
protein, few NOEs of oligodeoxyribonucleotides were detected
(Fig. 1c), indicating that the TRNOEs depend on
interactions of the oligodeoxyribonucleotides with RecA protein.

The NOEs Are Caused by Specific Binding of DNA to
Activated RecA Protein. First, we examined whether the
observed interactions between DNA and RecA protein had
essential characteristics in common with homologous pairing,
specifically a requirement for ATPγS and a preference for
DNA over RNA.

Fig. 2. Two-dimensional transferred NOESY spectra of an
oligodeoxyribonucleotide. Two-dimensional transferred NOESY spectra
of 0.80 mM d(TACG), 97 μM RecA protein, and 0.80 mM ATPγS at
180 msec mixing time at 25°C. The regions of H8/H6 and H2'/H2" are
shown in A, and those of H8/H6 and H3'/H1' in B.

Consistent with both the requirement of ATP (or ATPγS)
for the formation of active presynaptic filaments and the
reduction of affinity for DNA upon hydrolysis of ATP to ADP,
TRNOEs of the oligodeoxyribonucleotides induced by the
addition of RecA protein were significantly reduced when
ATPγS was replaced by ADP (Fig. 1b vs. 1a).

RecA protein binds to RNA with much lower affinity than to
DNA (36, 37). We observed that the intensity of TRNOE
signals was significantly decreased when DNA was replaced by
RNA of the same sequence except for the substitution of U
for T (Fig. 1d).

These observations indicate that the NOEs observed here
are caused by specific interactions of RecA protein with DNA
that are the same as those responsible for homologous pairing.

Transferred NOE Analysis of DNA Bound to RecA Protein.
The patterns of NOE crosspeaks exhibited by the
oligodeoxyribonucleotides tested in this study have common features.
Unusually intense interresidue crosspeaks between H3' and
H8/H6 were observed in transferred NOESY spectra, whereas
few, if any, interresidue crosspeaks between H1'
and base protons were detected (Fig. 2B). We also observed
relatively weak sequential H2'-H8/H6 and H2"-H8/H6
NOEs of comparable intensity (Fig. 2A). These features contrast
remarkably with those expected for B-form or A-form DNA
(28).

Based on these NOE data, we did structural calculations
applying a simulated annealing protocol by use of X-PLOR (26).
The final structure for each oligodeoxyribonucleotide was well
defined, as shown in Fig. 3 with the result for d(TACG) as an
example. Thus, we conclude that the obtained structure of
each oligodeoxyribonucleotide is the major species that mainly
contributes to the TRNOE crosspeaks. The calculated structures
for all the tested oligodeoxyribonucleotides with variations
in sequence and length have a common substructural
feature, suggesting that the DNA structure defined in this
study is not specific to a sequence or to the size of
oligodeoxyribonucleotides.

Fig. 3. Superposition of calculated structures of an
oligodeoxyribonucleotide, d(TACG). One hundred structures were calculated
independently by the use of a simulated annealing protocol (X-PLOR; ref.
26). The 10 lowest energy structures are best fitted at the T-A-C region.
The total number of NOE constraints is 59: 39 intraresidue NOEs and
20 interresidue NOEs. The root-mean-square deviation of the
T-A-C region is 0.30 Å. All residues show similar deoxyribose-base
stacking, whereas the fourth residue (G) is disordered because of few
NOE constraints due to signal overlapping. There is no violation in the
final structure.

Fig. 4A illustrates a refined molecular model for the structure
of single-stranded DNA bound to RecA protein in the
presence of ATPγS. If we assume that a helical axis is
perpendicular to the base planes as indicated by linear dichroism
(38), the axial rise per base is nearly 5 Å (1.5 times that of
B-form DNA).
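
As a quick consistency check of the stated extension, one can take the commonly quoted axial rise of B-form DNA, about 3.4 Å per base pair (an assumed textbook value; the text gives only the ratio), and multiply by the extension factor:

\[
1.5 \times 3.4\ \text{Å} \approx 5.1\ \text{Å per base}
\]

This agrees with the "nearly 5 Å" spacing reported for the RecA-bound single strand.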

DISCUSSION 

By TRNOE analysis, we have determined a three-dimensional
structure of single-stranded DNA that has been extended by
binding to RecA protein in the presence of ATPγS. The most
prominent feature of the DNA structure is in the manner of
base stacking. In the normal forms of DNA, adjacent bases are
stacked by a van der Waals contact. In contrast, in the RecA
protein-bound form, the 2'-methylene moiety of each deoxyribose
is located above the base of the next residue in place of
the normal base-base stacking, and the bases of the single-stranded
DNA are separated by nearly 5 Å (Fig. 4A). This
spacing agrees well with the 50% extension of single-stranded
DNA in presynaptic filaments observed by electron microscopy
(Fig. 4B; refs. 14 and 39), and the present observations
reveal the structural basis for that extension. Interactions
between a methylene moiety and an aromatic ring have been
observed in various biomacromolecules (see ref. 40 for review).
There is, to our knowledge, no prior report of extended
DNA structures maintained by deoxyribose-base stacking
through a methylene-base interaction. On the other hand,
another type of deoxyribose-base interaction is found in
Z-form DNA (41): the cytidine O4' oxygen is situated above
the six-membered ring of guanine at d(CpG) steps.

What is the meaning of this characteristic deoxyribose-base 
stacking in the RecA-induced DNA extension? RecA protein 
has been proposed to bind primarily to the phosphate back- 
bone of single-stranded DNA (42). This type of intermolecular 
interaction probably triggers the extension of single-stranded 
DNA upon polymerization of RecA monomers along the DNA 
backbone. In addition, we propose that the hydrophobic
deoxyribose-base stacking interaction stabilizes the unique DNA
conformation intramolecularly. This mechanism presents
a striking contrast to that of the widely found DNA extension
upon intermolecular stacking interactions, namely, intercalation
of aromatic moieties of a dye or amino acid residue
between adjacent bases. In this context, it has been suggested
that some intramolecular interaction contributes to stabilization
of an extended DNA; protein-free DNA molecules, under
stress from an external force, undergo a highly cooperative
transition into a stretched structure whose length is 1.7 times
that of B-form DNA (43, 44).

Judging from the DNA structure revealed by this study, 
RNA molecules would not form a stable complex with RecA 
protein, because the 2'-hydroxyl group of RNA will repel the 
base and the sugar of the following residue (Fig. 4C). This 
would explain previous and current observations that RNA has
much lower affinity for RecA protein than DNA has (Fig. 1d).

DNA has an advantage over RNA as material to hold 
genetic information. A widely accepted reason has been that 
H2" confers chemical stabilization on DNA compared with 
2'-OH of RNA. Our study suggests another role of H2" of
DNA as genetic material: deoxyribose-base stacking, including
2'-methylene moieties of DNA, is required for the binding to
RecA protein and its homologs, which are general and pivotal
machines for homologous recombination. In addition, the
deoxyribose-base stacking could be intrinsically required for a 
homology search between polynucleotides (see below). These 
newly suggested roles of 2'-methylene moieties might account 
for the low efficiency and fidelity of homologous recombina- 
tion in an RNA virus in contrast to high efficiency and 
accuracy in homologous recombination in organisms with 
DNA genomes (45). 

What is the advantage of the structure stabilized by the 
deoxyribose-base stacking through a methylene-base interac- 
tion? The processes of homologous recognition and strand 
exchange require rotation of bases so as to exchange partners 
in base pairs. As described above, RecA protein appears to 
bind primarily to the phosphate backbone of single-stranded 
DNA and leaves the bases free for homologous pairing (42). In 
DNA stabilized by deoxyribose-base stacking, the rotation of 
adjacent bases is less hindered sterically than in B-form or 
A-form (Fig. 4B). Such freer rotation of bases may favor both 
homologous pairing and strand exchange. 

Another merit of the extended DNA structure is suggested 
by theoretical conformation analysis of triplex DNA molecules 
(R-form DNA), which are supposed to be formed during RecA 
protein-promoted homologous pairing (46-48). According to 
a structural prediction by the theoretical calculations, the bases 
of the third strand in the putative triplex DNA would incline 
and mispair to adjacent base pairs when DNA molecules are 
not extended (49). 

In the presence of ATP or its unhydrolyzable analog, RecA 
protein binds to double-stranded DNA as well and forms a 
helical nucleoprotein filament. Double-stranded DNA in the 
RecA filament has also been found to be extended by 1.5 times 
as compared with B-form DNA, and to be unwound to 18.6 bp 
per turn (16, 17). We suppose that RecA-bound double- 
stranded DNA would be extended by the deoxyribose-base 
stacking as in the case of single-stranded DNA. We made a 
model-building study on double-stranded DNA including the 
deoxyribose-base stacking, and obtained a structure that fits 
the parameters of RecA filaments (T.N., unpublished work). 

Finally, although the structure of a long stretch of DNA in 
presynaptic filaments could be different from those of the 
oligodeoxyribonucleotides, we believe that the three- 
dimensional structure revealed by this study reflects the struc- 
ture of single-stranded DNA in presynaptic filaments for the 
following reasons: (i) the signals from which the structure was
deduced depend on the presence of both RecA protein and
ATPγS, (ii) the structure can be adopted by DNA but not by
RNA, (iii) the pattern of NOE crosspeaks is independent of
residues in an oligodeoxyribonucleotide and of the sequence
and length of the tested oligomers, and (iv) the structure agrees
well with the extension of single-stranded DNA in the filaments
as observed by electron microscopy.

Thus, the structure of the oligodeoxyribonucleotides deter- 
mined in this study provides a new model that explains how the 
extended form of DNA in the RecA nucleoprotein filament is 
stabilized and that further suggests that the functional significance
of this form is to facilitate the rotation of bases.

Human Rad51 protein (HsRad51) is a homolog of Escherichia coli RecA
protein, and functions in DNA repair and recombination. In higher
eukaryotes, Rad51 protein is essential for cell viability. The N-terminal
region of HsRad51 is highly conserved among eukaryotic Rad51 proteins
but is absent from RecA, suggesting a Rad51-specific function for this
region. Here, we have determined the structure of the N-terminal part of
HsRad51 by NMR spectroscopy. The N-terminal region forms a compact
domain consisting of five short helices, which shares structural similarity
with a domain of endonuclease III, a DNA repair enzyme of E. coli.
NMR experiments did not support the involvement of the N-terminal
domain in HsRad51-HsBrca2 interaction or the self-association of
HsRad51 as proposed by previous studies. However, NMR titration
experiments demonstrated a physical interaction of the domain with
DNA, and allowed mapping of the DNA binding surface. Mutation
analysis showed that the DNA binding surface is essential for
double-stranded and single-stranded DNA binding of HsRad51. Our results
suggest the presence of a DNA binding site on the outside surface of the
HsRad51 filament and provide a possible explanation for the regulation
of DNA binding by phosphorylation within the N-terminal domain.

© 1999 Academic Press

Introduction

The human Rad51 protein is a homolog of Escherichia coli
RecA protein and Saccharomyces cerevisiae
Rad51 protein (Shinohara et al., 1993). The S. cerevisiae
RAD51 gene, along with other members of the
RAD52 epistasis group of genes including RAD50,
RAD52, RAD54, RAD55 and RAD57, functions in
DNA double-strand break repair and genetic
recombination (Petes et al., 1991; Resnick, 1987;
Shinohara et al., 1992). Although the precise cellular
role of HsRad51 is not fully understood, it is
believed to be involved in DNA repair and recombination
(see Baumann & West, 1998; Vispe & Defais,
1997). HsRad51 has biochemical properties similar
to those of RecA and yeast Rad51, i.e. it catalyzes in vitro the
pairing and exchange of homologous double-stranded
DNA and single-stranded DNA
(Baumann et al., 1996; Gupta et al., 1997; Sung,
1994). However, the activity of HsRad51 is significantly
lower than that of RecA. This suggests a
requirement for additional factors in Rad51 functioning,
and several biochemical studies have
shown that HsRPA (Baumann et al., 1996; Sung,
1994), HsRad52 (Benson et al., 1998; Shen et al.,
1996), and HsRad54 (Golub et al., 1997; Petukhova
et al., 1998) stimulate HsRad51-mediated reactions
through direct interactions with HsRad51. Whereas
yeast cells deficient in Rad51 are viable (Shinohara
et al., 1992), transgenic mice lacking Rad51 die in
the early stage of development (Sonoda et al.,
1998), and chicken B-cells stop growing when
the expression of Rad51 is depressed (Sonoda et al.,
1998). In addition, HsRad51 has been found to
interact with several tumor suppressors, namely p53
(Buchhop et al., 1997; Sturzbecher et al., 1996), Brca1
(Scully et al., 1997) and Brca2 (Chen et al., 1998;
Katagiri et al., 1998; Mizuta et al., 1997; Sharan et al.,
1997; Wong et al., 1997). These findings suggest
essential roles for Rad51 in cell proliferation and
genome maintenance in higher eukaryotes.

Abbreviations used: HsRad51, the human Rad51
protein; NOE, nuclear Overhauser enhancement;
NOESY, nuclear Overhauser enhancement spectroscopy;
HSQC, heteronuclear single quantum correlation;
TOCSY, total correlation spectroscopy; HMQC,
heteronuclear multiple quantum correlation.

Alignment of the amino acid sequences of RecA
and HsRad51 shows that the central domain of
RecA is homologous to the C-terminal portion
(approximately two-thirds from the C terminus) of
HsRad51 (Shinohara et al., 1993). HsRad51 has
extra sequences on its N-terminal side, whereas
RecA has an extra C-terminal domain which comprises
the DNA binding surface (Figure 1; Aihara
et al., 1997; Kurumizaka et al., 1996). The N-terminal
region (amino acid residues 1-95) of HsRad51
is well conserved among eukaryotic Rad51 proteins,
but is absent from RecA. This suggests an
important role for this region in Rad51-specific
functions such as interactions with other proteins.
Indeed, yeast two-hybrid analyses showed that the
N-terminal region of Rad51 mediates both Rad51-Rad52
interaction and the self-association of Rad51
in S. cerevisiae (Donovan et al., 1994), and a small
region near the N terminus (amino acid residues
1-43) of mouse Rad51 protein (MmRad51) is essential
for the interaction with MmBrca2 (Sharan et al.,
1997). The importance of the N-terminal region is
further supported by the recent finding that c-Abl
tyrosine kinase regulates HsRad51 function
through phosphorylation of Tyr54 (Yuan et al.,
1998). It may also be possible that the N-terminal
region of Rad51 takes the place of the C-terminal
domain of RecA, and functions in DNA binding.

While the tertiary structure of HsRad51 has not
been clarified, the crystal structure of RecA has
been determined (Story et al., 1992). An electron
microscopic study demonstrated that HsRad51-DNA
filaments resembled those of RecA (Benson
et al., 1994; Ogawa et al., 1993). This result, combined
with the considerable degree of sequence
homology between the C-terminal portion of
HsRad51 and the core domain of RecA, suggests
that the structures of these two proteins are very
similar within the homologous region. However, an
amino acid sequence homology search has revealed
neither the structure nor the function of the N-terminal
region of HsRad51. We anticipated that structural
information about the N-terminal region
might provide a clue about the function of
HsRad51.

Figure 1. Comparison of HsRad51 and RecA. Amino
acid sequences of HsRad51 and E. coli RecA are aligned
as described (Shinohara et al., 1993). Striped bars indicate
the N-terminal domain of HsRad51 identified in
this study, and the C-terminal domain of RecA. Shaded
bars show conserved regions, with black bars showing
the ATP binding consensus sequences.

Here, we describe the structure determination
and functional analysis of the N-terminal region of
HsRad51. It was found that the N-terminal region
folds into a distinct domain with an all-helical fold,
and that the domain carries a DNA binding surface.
The results are described in detail below, and
the possible roles of the N-terminal domain in the
homologous pairing reaction are discussed.

Results 

Structure of the N-terminal region of HsRad51 

An N-terminal fragment of HsRad51 containing
residues 1 to 114 of the full-length protein, which
was found to be highly soluble and monomeric in
solution, was used in the NMR study. Residues 1
to 15 and 86 to 114 are disordered as judged by
narrower 1H NMR linewidths compared with those
of the structured region of the polypeptide, the
presence of strong HN-H2O crosspeaks in the 15N-separated
NOESY spectrum (Marion et al., 1989), and
the absence of long range nuclear Overhauser
enhancement (NOE) signals. Thus, the N and C
terminal parts were not included in the structure
calculation. The solution structure of the segment
consisting of residues 16 to 85 was calculated from
a total of 1388 NMR-derived restraints (Table 1)
using the simulated annealing protocol with the
program X-PLOR (Brünger, 1992; Nilges et al.,
1988). The backbone (N, Cα, C) superposition for
an ensemble of the final 30 structures is presented
in Figure 2. An alternate minor backbone conformation
was indicated for the region including residues
Gly21-Pro22-Gln23 and Val49-Glu50-Ala51,
where two sets of signals were observed in the
1H-15N heteronuclear single quantum correlation
(HSQC) spectrum (Bodenhausen & Ruben, 1980;
Grzesiek & Bax, 1993). Higher mobility may also
be present within residues Ala53-Tyr54, whose
crosspeaks were missing in the 1H-15N HSQC spectrum.
We noted that despite the fact that the
domain carries an overall negative charge, a cluster
of Lys residues makes a positively charged patch
on the protein surface (Figure 3).

Table 1. Statistics for the final ensemble of 30 structures

A. Root mean square deviations from experimental restraints
   Distance restraints (Å)
     All (1321)                              0.029 ± 0.001
     Interproton distances
       Intraresidue (285)                    0.022 ± 0.003
       Sequential (240)                      0.022 ± 0.003
       Short-range (2 ≤ |i - j| ≤ 4) (183)   0.046 ± 0.003
       Long-range (|i - j| > 5) (103)        0.057 ± 0.003
       Ambiguous (489)                       0.012 ± 0.003
     Hydrogen bonds (21)                     0.045 ± 0.006
   Dihedral angle restraints (°) (67)        0.46 ± 0.13
B. Root mean square deviations from idealized geometry
   Bonds (Å)                                 0.00313 ± 0.00013
   Angle (°)                                 0.64 ± 0.01
   Improper (°)                              0.53 ± 0.02
C. Coordinate precision for residues 24-79 (Å)
   Backbone                                  0.48 ± 0.08
   Heavy-atoms                               1.06 ± 0.01
D. PROCHECK* Ramachandran map analysis (all structures) (%)
   Most favoured regions                     57.1
   Additional allowed regions                33.7
   Generously allowed regions                6.7
   Disallowed regions                        2.5

* Laskowski et al. (1996).

Figure 2. Structure of the N-terminal domain of
HsRad51. Stereoview showing the backbone (N, Cα, C)
atoms of 30 superimposed NMR-derived structures for
residues 19-83. This Figure was generated using the
program MIDASPlus (Ferrin et al., 1988).

Protein interactions 

We first investigated whether the N-terminal
domain is involved in the protein-protein interactions
suggested by previous studies, including
those on RecA (Donovan et al., 1994; Sharan et al.,
1997; Story et al., 1992). Two polypeptides were
tested for their capacity to interact with
HsRad51(1-114): a fragment of HsBrca2 that contains
residues 3273 to 3309 and the full-length
HsRad51 itself. HsBrca2(3273-3309) is 95% identical
with MmBrca2(3196-3232), which was identified
as the minimal region of MmBrca2 needed for
the interaction with the N-terminal region of
MmRad51 by yeast two-hybrid analysis
(Sharan et al., 1997). 15N-labeled HsRad51(1-114)
was titrated with each of the polypeptides, and the
interactions were monitored by measuring a series
of 1H-15N HSQC spectra. However, irrespective of
the protein used, the spectra did not change
throughout the titration. Therefore the NMR experiments
do not support the involvement of the N-terminal
domain in the interaction with HsBrca2(3273-3309),
or in the self-association of HsRad51.
We also found by NMR experiments and GST pull-down
analysis that the N-terminal domain of
HsRad51 does not bind HsRad52 either (H.K.,
unpublished results).

Interaction with DNA 

To test another possible function, we next exam- 
ined if the N-terminal domain of HsRadSl interacts 
with DNA using chemical shift perturbation exper- 
iments. The titration of a 12 bp double-stranded 
DNA into the NMR sample of ,5 N-labeled 
HsRadSl (1-1 14) caused shifting and broadening of 
selected crosspeaks in the ! H- N HSQC spectrum 
(Figure 4(a)), while the chemical shifts of the 
remaining residues (including those in the unstruc- 
tured regions) were only slightly affected or not 
affected at all. This result indicates a direct inter- 
action between HsRadSl (1-1 14) and DNA. Shifting 
of the crosspeaks was also observed in a similar 
titration using 12mer single-stranded DNA, though 
the chemical shift changes were small. The mode 
of the chemical shift change indicates that the bind- 
ing behavior is fast exchange on the NMR time- 
scale, which did not allow us the structure 
determination of the protein-DNA complex. The 
dissociation constant (K d ) values were estimated to 
be 0.31 mM and 0.89 mM in the double-stranded 
and single-stranded DNA binding, respectively 
(Figure 5). The resonances affected were those of 
the backbone amides of Ile61, Lys64, Gly65, Ile66, 
Ala69, and the neighboring residues on the protein 
surface (Figure 4(b) and (c)). This indicates that the 
surface encompassed by these residues furnishes 
the binding site for DNA. The region overlaps with 



a b 




Figure 3. Electrostatic surface 
potential calculated using the pro- 
gram GRASP (Nicholls & Honig, 
1992). Positive potential is colored 
blue and negative potential is 
colored red. (a) Same orientation as 
shown in Figure 2. (b) Viewed after 
180° rotation around the vertical 
axis. 
Figure 4. Chemical shift perturbation upon DNA binding. (a)
Expansions of 1H-15N HSQC spectra of 15N-labeled HsRad51(1-114) in
the absence (black contours) and presence (red contours) of three
molar equivalents of 12 bp double-stranded DNA. The crosspeaks that
shift upon the addition of DNA are indicated. (b), (c) Chemical
shift change of backbone 1H and 15N, calculated as
$[(\Delta\delta\,{}^{1}\mathrm{H})^2 + (\Delta\delta\,{}^{15}\mathrm{N})^2]^{1/2}$ (Hz),
is color-coded and mapped onto the (b) molecular surface or the (c)
ribbon drawing of HsRad51(19-83). (b) and (c) Drawn using programs
GRASP (Nicholls & Honig, 1992) and MIDASPlus (Ferrin et al, 1988),
respectively.



the positively charged patch (Figure 3) mentioned 
before. 
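
The combined shift measure quoted in the captions of Figures 4, 5 and 7 can be sketched in a few lines. This is a minimal illustration, not the authors' processing script: the function name and the default spectrometer frequencies (600.13 MHz for 1H and 60.81 MHz for 15N, roughly what a 600 MHz instrument gives) are assumptions, since the text does not state at which field the titration spectra were recorded.

```python
import math


def combined_shift_hz(delta_h_ppm, delta_n_ppm, h_mhz=600.13, n_mhz=60.81):
    """Combined 1H/15N chemical shift change in Hz.

    delta_h_ppm, delta_n_ppm: shift changes in ppm (bound minus free).
    h_mhz, n_mhz: assumed 1H and 15N resonance frequencies in MHz.
    """
    # A shift in ppm multiplied by the resonance frequency in MHz gives Hz.
    delta_h_hz = delta_h_ppm * h_mhz
    delta_n_hz = delta_n_ppm * n_mhz
    # Square root of the sum of squares, as in the figure captions.
    return math.hypot(delta_h_hz, delta_n_hz)


if __name__ == "__main__":
    # Hypothetical shift changes for a single backbone amide.
    print(round(combined_shift_hz(0.05, 0.30), 1), "Hz")
```

Expressing both dimensions in Hz before combining them avoids an arbitrary ppm scaling factor between 1H and 15N, which is presumably why the captions report the change in Hz.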

Interestingly, a search of the Brookhaven Protein Data Bank with the
program Dali (Holm & Sander, 1993) showed that the structure of the
N-terminal domain of HsRad51 is similar to that of the six-helix
barrel domain of endonuclease III, a DNA repair enzyme of E. coli
(Thayer et al, 1995; Figure 6). The six-helix barrel domain of
E. coli endonuclease III is known to be involved in DNA binding,
indicating that the structural similarity between the N-terminal
domain of HsRad51 and the six-helix barrel domain is functionally
relevant.





Figure 5. Chemical shift change of backbone 1H and 15N of Gly65 in
the titration with 12 bp double-stranded DNA (•) or 12mer
single-stranded DNA (■), calculated as
$[(\Delta\delta\,{}^{1}\mathrm{H})^2 + (\Delta\delta\,{}^{15}\mathrm{N})^2]^{1/2}$ (Hz)
and plotted against DNA concentration (mM). The continuous line is
the best fit of the data to the equation described in Materials and
Methods. The concentration of HsRad51(1-114) was 0.2 mM.



Figure 6. Structural similarity of the N-terminal domain of HsRad51
and the six-helix barrel domain of E. coli endonuclease III.
Stereodiagram showing the backbone superposition of HsRad51 (red,
residues 26-84) and endonuclease III (cyan, residues 31-99) (Thayer
et al, 1995). The r.m.s.d. along the Cα atoms of residues 26-29,
32-35, 41-45, 46-49, 50-53, 55-65 and 67-84 of HsRad51 with the
corresponding part of endonuclease III is 2.86 Å. The Figure was
generated using the program MIDASPlus (Ferrin et al, 1988).
However, there may be no evolutionary relationship between these
two proteins, since the N-terminal domain of HsRad51 lacks a
counterpart for the helix-hairpin-helix DNA binding motif present
in the six-helix barrel domain of endonuclease III.

Mutation analyses 

To confirm the functional significance of the DNA binding surface
identified by the NMR experiment, we made mutants of HsRad51(1-114)
and full-length HsRad51. In order to diminish the positive charge
and perturb the local conformation around the DNA binding surface,
Lys64 was replaced by a Gly residue (K64G) in both proteins. We
first compared the 1H-15N HSQC spectrum of HsRad51(1-114)-K64G with
that of the wild-type fragment. A significant difference in the
backbone 1H and 15N chemical shifts was found for residues within
or near the loop region between Asn62 and Ser67. In contrast, the
chemical shifts of the other residues changed very little or not at
all (Figure 7(a)). This indicates that the mutation perturbed the
local conformation around the DNA binding surface. Chemical shift
perturbation upon the addition of DNA was small for
HsRad51(1-114)-K64G (Figure 7(b)), suggesting that DNA binding was
diminished by the mutation.

We then examined the DNA binding property of
the full-length mutant HsRad51(K64G) by the gel
mobility shift assay. Wild-type HsRad51 makes
complexes with single-stranded DNA and double- 






Figure 7. (a) Chemical shift difference between the wild-type and K64G. Differences in the backbone chemical shift
calculated as $[(\Delta\delta\,{}^{1}\mathrm{H})^2 + (\Delta\delta\,{}^{15}\mathrm{N})^2]^{1/2}$ are color-coded. The Figure was drawn using the program MIDASPlus (Ferrin
et al, 1988). (b) Expansions of 1H-15N HSQC spectra of 15N-labeled HsRad51(1-114)-K64G in the absence (black
contours) and presence (red contours) of three molar equivalents of 12 bp double-stranded DNA (horizontal axis: 1H, ppm).
Same region as that shown in Figure 4(a).
stranded DNA as shown by the reduced mobility of these DNAs through
polyacrylamide or agarose gels (Figure 8). HsRad51-K64G showed
decreased single-stranded DNA and double-stranded DNA binding
activity compared to the wild-type protein. These results indicate
that the DNA binding surface within the N-terminal domain plays an
important role in the DNA binding of HsRad51.

Discussion 

In the absence of significant amino acid sequence homology to other
proteins, we determined the three-dimensional structure of the
N-terminal domain of HsRad51 in order to elucidate its function.
NMR experiments and a mutation analysis revealed that the
N-terminal domain is involved in DNA binding.

Like its bacterial homolog RecA, HsRad51 catalyzes the pairing of
single-stranded DNA and double-stranded DNA sharing homologous
sequences. It also catalyzes the subsequent strand exchange
(Baumann et al, 1996; Baumann & West, 1997). Both RecA and Rad51
bind to single-stranded DNA to form well-conserved right-handed
helical filaments, containing single-stranded DNA along their axes
(Benson et al, 1994; Ogawa et al, 1993). In the case of RecA, the
pairing reaction starts with the formation of a RecA-single-stranded
DNA filament, which then incorporates double-stranded DNA (Kahn &
Radding, 1984; Shibata et al, 1979; West et al, 1980). In the
present study we identified a DNA binding surface within the
N-terminal domain of HsRad51. This finding has implications for the
function of the N-terminal domain in the homologous pairing
reaction. As shown in Figure 1, HsRad51 has an extra segment on its
N-terminal side situated outside of the homologous region, while
RecA has an extra C-terminal domain (Shinohara et al, 1993). In the
case of RecA, the C-terminal domain is located on the outside
surface of the RecA filament and projects into the helical groove
in a pendulous manner (Story et al, 1992; Yu et al, 1998), while
the N-terminal part is located on the other side of the groove
(Story et al, 1992; Figure 9). In the three-dimensional
reconstruction of the electron micrographs of the Rad51-DNA
filament, the N-terminal domain cannot be visualized due to its
flexibility (Ogawa et al, 1993). Assuming similarity between the
Rad51 structure and the RecA crystal structure in the homologous
region, the N-terminal domain of Rad51 may also be on the outside
surface of the Rad51-single-stranded DNA filament and protruding
into the helical groove, but from the opposite
pole of the protein structure. In our previous bio- 
Figure 8. Binding of wild-type and the mutant (K64G) of full-length
HsRad51 to double- and single-stranded DNA. (a) Gel mobility shift
assay showing the binding of wild-type HsRad51 (lanes 2-5) and
HsRad51-K64G (lanes 6-9) to double-stranded DNA. Protein
concentrations were 0 (lane 1), 0.6 µM (lanes 2 and 6), 1.2 µM
(lanes 3 and 7), 2.4 µM (lanes 4 and 8), and 4.8 µM (lanes 5 and
9). (b) Gel mobility shift assay showing the binding of wild-type
HsRad51 (lanes 2-5) and HsRad51-K64G (lanes 6-9) to single-stranded
DNA. Protein concentrations were 0 (lane 1), 0.3 µM (lanes 2 and
6), 0.6 µM (lanes 3 and 7), 0.9 µM (lanes 4 and 8), and 1.2 µM
(lanes 5 and 9).




Figure 9. Van der Waals surface of the 12 RecA 
monomers in the crystal structure (Story et al, 1992). 
The N-terminal part is shown in cyan and the C-term- 
inal domain in red. The Figure was drawn with MIDAS- 
Plus (Ferrin et al, 1988). 



chemical and NMR analyses we suggested that the 
DNA binding surface in the C-terminal domain of 
RecA facilitates the spooling of double-stranded 
DNA into the helical groove of the RecA-single- 
stranded DNA filament (Aihara et al, 1997; 
Kurumizaka et al, 1996). Considering the spatial 
relationship, the N-terminal domain of HsRad51
could have the same function as the C-terminal 
domain of RecA, though these two domains share 
no sequence or structural homology. 

From our mutation analysis, it seems likely that the N-terminal DNA
binding surface is also important for the binding of HsRad51 to
single-stranded DNA. It has been reported that the c-Abl tyrosine
kinase inhibits the single-stranded DNA binding of HsRad51 through
phosphorylation of Tyr54 (Yuan et al, 1998). In accordance with
this observation, the aromatic side-chain of Tyr54 is not buried
inside the molecule (Figure 4(c)). The
inhibition of DNA binding by the phosphorylation 
could be an effect of either electrostatic repulsion
or perturbation of the N-terminal domain structure 
in the DNA binding site identified by this study. 

In conclusion, we found that the N-terminal part of the human Rad51
protein constitutes a DNA binding domain. The domain may lie on the
outside surface of the Rad51 filament, suggesting a critical role
for this domain in the homologous pairing reaction.

Materials and Methods 

Sample preparation 

HsRad51(1-114) was expressed in E. coli strain JM109(DE3) using the
pET3a vector (Novagen). tRNAarg3 and tRNAarg4 were coexpressed to
facilitate the translation of the minor codons. Uniformly 13C/15N-
and 15N-labeled proteins were prepared by growing the bacteria on
minimal medium containing 15NH4Cl either with or without
13C6-glucose. The protein was purified from cell-free extract by
successive DEAE-Sepharose Fast Flow (Pharmacia) and Sephadex G50
Superfine (Pharmacia) column chromatography. Mass spectrometry
suggested that the N terminus of the protein is acetylated after
cleavage of the first methionine residue. Typical NMR samples for
structure determination contained 1 mM protein, 20 mM sodium
phosphate (pH 6.5), 100 mM NaCl, 2 mM DTT and 0.02% NaN3 in
1H2O/2H2O (9:1) or 2H2O.

Preparation of wild-type and the mutant form of full-length HsRad51
will be described elsewhere (H.K. et al, unpublished results).
HsBrca2(3273-3309) was expressed as a glutathione S-transferase
fusion and purified by affinity chromatography, followed by cleavage
with thrombin and cation exchange chromatography.

Structure determination 

All NMR spectra were acquired at 30 °C on a Bruker DRX600 or ARX400
spectrometer. The 1H, 13C and 15N resonances of the backbone were
assigned using 3D CBCA(CO)NNH (Grzesiek & Bax, 1992), HNCACB
(Wittekind & Mueller, 1993) and 2D 1H-15N HSQC (Bodenhausen &
Ruben, 1980; Grzesiek & Bax, 1993) experiments. The side-chain
signals were assigned using 3D HCCH-TOCSY (Bax et al, 1990),
H(CCCO)NNH (Grzesiek et al, 1993; Clowes et al, 1993), C(CCO)NNH
(Grzesiek et al, 1993; Clowes et al, 1993), 15N-separated TOCSY
(Marion et al, 1989), HNHB (Archer et al, 1991), and 2D 1H-13C HSQC
experiments (Bodenhausen & Ruben, 1980). Complete chemical shift
assignments have been deposited in the BioMagResBank, with the
accession number 4328. Distance restraints were obtained from 3D
15N- or 13C-separated NOESY (Marion et al, 1989) and 2D 1H-1H NOESY
(Kumar et al, 1980) (recorded in 2H2O) spectra with mixing times of
150 ms. NOEs were classified into four distance ranges, 1.8 to 3.2,
1.8 to 3.8, 1.8 to 5.5, and 1.8 to 6.0 Å, according to the peak
intensities. An additional 0.5 Å was added to the upper limits for
distances involving methyl protons. NOE crosspeaks that could not
be assigned unambiguously were included as ambiguous distance
restraints (Nilges et al, 1997), restraining the r^-6 sum of the
distances between contributing atoms. The φ angle restraints were
obtained from the 3J(HN,Hα) coupling constants measured in the
HMQC-J (Kay & Bax, 1990) experiment. Hydrogen bond distance
restraints and ψ angle restraints were employed for α-helical
regions based on the 3J(HN,Hα) coupling constants and chemical
shift index (Wishart & Sykes, 1994). Structures were calculated
with the random simulated annealing protocol using the program
X-PLOR 3.851 (Brünger, 1992; Nilges et al, 1988). The level of
ambiguity in the ambiguous distance restraints was reduced during
the structure calculation by discarding potential assignments that
gave distances >8 Å in the calculated structures with lower target
function values. This cut-off was set to 0.5 Å above the upper
limit of each distance restraint in the latter stages of the
structure calculation. The final structure calculation employed
1300 inter-proton distance restraints, 40 φ angle restraints, 21
hydrogen bond distance restraints and 27 ψ angle restraints. Of 100
structures calculated, 70 structures had no restraint violations
above 0.5 Å or 5°. The 30 structures with lowest energy were used
to represent the solution structure of HsRad51(1-114).
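
The restraint bookkeeping described above can be illustrated with a short sketch. The class labels (`strong` through `very_weak`) and the function names are hypothetical, since the text gives only the four distance ranges and not the intensity thresholds used to assign them. The r^-6 summation is written in the form commonly used for ambiguous distance restraints, D = (sum of d_i^-6)^(-1/6); that this matches the exact evaluation in the structure calculation is an assumption.

```python
# Distance ranges (in Å) quoted in the text; the class names are hypothetical labels.
NOE_LOWER = 1.8
NOE_UPPER = {"strong": 3.2, "medium": 3.8, "weak": 5.5, "very_weak": 6.0}
METHYL_CORRECTION = 0.5  # added to the upper limit for methyl protons


def noe_bounds(noe_class, involves_methyl=False):
    """Return (lower, upper) distance bounds in Å for one NOE restraint."""
    upper = NOE_UPPER[noe_class]
    if involves_methyl:
        upper += METHYL_CORRECTION
    return NOE_LOWER, upper


def r6_summed_distance(distances):
    """Effective distance for an ambiguous restraint: (sum d^-6)^(-1/6)."""
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


def within_upper_bound(distances, noe_class, involves_methyl=False):
    """True if the r^-6 summed distance does not exceed the upper bound."""
    _, upper = noe_bounds(noe_class, involves_methyl)
    return r6_summed_distance(distances) <= upper


if __name__ == "__main__":
    # Hypothetical ambiguous crosspeak with two candidate proton pairs.
    print(r6_summed_distance([4.0, 7.5]))
    print(within_upper_bound([4.0, 7.5], "medium"))
```

Because of the r^-6 weighting, the summed distance is dominated by the shortest candidate pair and is never larger than it, so an ambiguous restraint is satisfied as soon as any one of its possible assignments is close enough.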

Titration experiments 

DNA used in the titration experiment was a 12 bp
d(CCGGTGATAGAC)/(GTCTATCACCGG) oligonucleotide, the central ten
base-pairs of which were reported to adopt the B-type conformation
in solution (Baleja et al, 1990). When single-stranded DNA was
used, the bottom strand was employed. The DNAs were added to a 0.1
or 0.2 mM solution of 15N-labeled HsRad51(1-114), and a series of
1H-15N HSQC spectra were recorded at various single-stranded or
double-stranded DNA concentrations ranging from 0.05 mM to 1.75 mM.
Dissociation constant ($K_d$) values were calculated by fitting the
experimental data (chemical shift change of Gly65 upon DNA binding)
to the equations:

$$K_d = \frac{([P_0] - [PD])([D_0] - [PD])}{[PD]}$$

$$[PD] = [P_0] \times \frac{\Delta_{obs}}{\Delta_{max}}$$

where $[P_0]$, $[D_0]$, and $[PD]$ are the concentrations of
HsRad51(1-114) (total), DNA (total), and the protein-DNA complex,
respectively; $\Delta_{obs}$ is the difference between the observed
chemical shift and the chemical shift of the free state; and
$\Delta_{max}$ is the chemical shift difference between the free
and the bound states. $K_d$ and $\Delta_{max}$ were treated as
fitting parameters during the curve fitting. In the titration with
the full-length HsRad51 and HsBrca2(3273-3309), the unlabeled
peptides were added to a 0.1 mM solution of 15N-labeled
HsRad51(1-114) to give final concentrations of 0.1 mM and 0.2 mM,
respectively. All titration experiments were performed at 293 K,
and the solution conditions were the same as those in the
measurements for structure determination.
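
As an illustration of the curve fitting described above, the two equations can be combined by solving the first one for [PD] (a quadratic in [PD]) and scaling by the maximum shift change. The sketch below is one possible implementation under stated assumptions, not the fitting procedure actually used: the function names, the search range for Kd, and the synthetic data are all hypothetical. For each trial Kd the best maximum shift change follows from linear least squares, so only Kd needs to be searched (a coarse logarithmic grid followed by golden-section refinement).

```python
import math


def complex_conc(p0, d0, kd):
    """[PD] from Kd = ([P0]-[PD])([D0]-[PD])/[PD], taking the physical (smaller) root."""
    b = p0 + d0 + kd
    disc = max(b * b - 4.0 * p0 * d0, 0.0)  # guard against rounding below zero
    return (b - math.sqrt(disc)) / 2.0


def predicted_shift(d0, p0, kd, delta_max):
    """Observed shift change: delta_max * [PD]/[P0]."""
    return delta_max * complex_conc(p0, d0, kd) / p0


def _profile_delta_max(d0s, shifts, p0, kd):
    # For fixed Kd the model is linear in delta_max, so least squares is closed-form.
    frac = [complex_conc(p0, d, kd) / p0 for d in d0s]
    return sum(y * f for y, f in zip(shifts, frac)) / sum(f * f for f in frac)


def _sse(d0s, shifts, p0, kd):
    dmax = _profile_delta_max(d0s, shifts, p0, kd)
    return sum((y - predicted_shift(d, p0, kd, dmax)) ** 2 for d, y in zip(d0s, shifts))


def fit_kd(d0s, shifts, p0, kd_min=1e-3, kd_max=100.0, n_grid=200):
    """Return (Kd, delta_max) minimizing the sum of squared residuals."""
    lo_log, hi_log = math.log(kd_min), math.log(kd_max)
    logs = [lo_log + i * (hi_log - lo_log) / (n_grid - 1) for i in range(n_grid)]
    errs = [_sse(d0s, shifts, p0, math.exp(x)) for x in logs]
    best = min(range(n_grid), key=errs.__getitem__)
    lo, hi = logs[max(best - 1, 0)], logs[min(best + 1, n_grid - 1)]
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):  # golden-section refinement inside the bracket
        a, b = hi - g * (hi - lo), lo + g * (hi - lo)
        if _sse(d0s, shifts, p0, math.exp(a)) < _sse(d0s, shifts, p0, math.exp(b)):
            hi = b
        else:
            lo = a
    kd = math.exp((lo + hi) / 2.0)
    return kd, _profile_delta_max(d0s, shifts, p0, kd)


if __name__ == "__main__":
    # Synthetic, noise-free data (mM and Hz); the values are hypothetical.
    dna = [0.05, 0.1, 0.2, 0.4, 0.8, 1.2, 1.75]
    obs = [predicted_shift(d, 0.2, 0.31, 150.0) for d in dna]
    print(fit_kd(dna, obs, p0=0.2))
```

With real data the residuals would not vanish, and an uncertainty estimate for Kd (for example from repeated fits to resampled data) would normally accompany the fitted value.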



DNA binding assays 

Linearized pGsat4 double-stranded DNA (18 µM, 3216 bp) was incubated
with wild-type or the mutant HsRad51 in 10 µl of reaction buffer
containing 50 mM Hepes-KOH (pH 7.5), 2 mM ATP, 20 mM creatine
phosphate, 1 mM DTT, 100 µg/ml bovine serum albumin, 12 units/ml
creatine phosphokinase, 15 mM MgCl2, and 3% (v/v) glycerol. Samples
were analyzed by electrophoresis through 0.8% agarose gel in 0.5×
TBE buffer. DNA and DNA-protein complexes were visualized by
ethidium bromide staining. In the single-stranded DNA binding
assay, a 32P-labeled single-stranded 50mer oligonucleotide (300 nM)
was incubated with HsRad51 in the same reaction buffer used in the
double-stranded DNA binding assay, except that the glycerol content
was 6% (v/v). Samples were analyzed by electrophoresis through
non-denaturing 12% polyacrylamide gel in TBE buffer. Bands were
analyzed with a BAS-2500 image analyzer (Fuji).

Protein Data Bank accession numbers 

The structure and restraint files have been deposited
in the Brookhaven Protein Data Bank with the accession
codes 1b22 and r1b22mr, respectively.
!!AA_MULTIPLE_ALIGNMENT 1.0 PileUp GCG
Attorney Docket: 1107
Serial No.: 09/537,654

Symbol comparison table: genrundata:blosum62.cmp CompCheck: 1102

GapWeight: 8 GapLengthWeight: 2

AF034955 Mouse Rad51D protein encoded by AF034955
AF034956 Human Rad51D protein encoded by AF034956
D10023 S. cerevisiae Rad51 protein encoded by D10023
X64270 S. cerevisiae Rad51 protein encoded by X64270
U22441 Tomato Rad51 protein encoded by U22441
U43652 A. thaliana Rad51 protein encoded by U43652
D14134 Human Rad51 protein encoded by D14134
NM079844 D. melanogaster Rad51 protein encoded by NM_079844
1107sid2 SEQ ID NO: 2 encoded by SEQ ID NO: 1
1107sid6 SEQ ID NO: 6
1107sid4 SEQ ID NO: 4
ac002387 A. thaliana Rad51C protein encoded by AC002387 At2g45280
AF029669 Human Rad51C protein encoded by AF029669
U84138 Human Rad51B protein encoded by U84138

FORMATTING: Amino acid residue [shading not reproduced], relative to SEQ ID NO: 2
Amino acid residue [shading not reproduced] (conserved substitution)
SEQ ID NO: 2 protein encoded by elected sequence

1 50 

AF034955aa 

AF034956aa 

D10023aa MSQVQEQHIS ESQLQYGNGS LMSTVPADLS QSWDGNGNG SSEDIEATNG 

X64270aa MSQVQEQHIS ESQLQYGNGS LMSTVPADLS QSWDGNGNG SSEDIEATNG 

U22441aa 

AtU43652aa MTTMEQR 

D14134aa _ 

NM079844aa I 

1107sid2 ~ Z 

1107sid6 ZI ZZZZZZI 

1107sid4 ~~ 

ac002387pep ~Z 

AF029669aa ZZ 

U84138a.a 



51 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 



SGDGGGLQEQ 
SGDGGGLQEQ 
HRNQKSMQDQ 
. RNQNAVQQQ 
MAMQMQ 



M GMLRAGLCPG 

M GVLRVGLCPG 

AEAQGEMEDE AYDEAALGSF VPIEKLQVNG 
AEAQGEMEDE AYDEAALGSF VPIEKLQVNG 

ND EIEDVQHGPF . PVEQLQASG 

DD . . EETQHGPF . PVEQLQAAG 

LEANA...DT SVEEESFGPQ . PISRLEQCG 
MEKLTNVQAQ QEEEEEEGP. LSVTKLIGGS 



100 

LTEETVQLLR 
LTEEMIQLLR 
ITMADVKKLR 
ITMADVKKLR 
IAALDVKKLK 
IASVDVKKLR 
INANDVKKLE 
ITAKDIKLLQ 



-MISFGRRKS PAIEETSLAT 

MRGKTFRFEM QRDLVSFPLS PAVRVKLVSA GFQTAEELLE VKPSELSKEV 
MGSK KLKRVGLSQE LCDRLSRHQI LTCQDFLCLS 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 



101 

GRKIKTVADL 
SHRIKTWDL 
ESGLHTAEAV 
ESGLHTAEAV 
DAGLCTVESV 
DAGLCTVEGV 
EAGFHTVEAV 
QASLHTVESV 



AAADLEEVAQ 
VSADLEEVAQ 
AYAPRKDLLE 
AYAPRKDLLE 
VYAPRKELLQ 
AYTPRKDLLQ 
AYAPKKELIN 
ANATKKQLMA 



kcgl|yk|lv 

KCGL YK . . . 



KG I 
KG I 
KG I 
KG I 
KG I 



150 

alrrvllaq| s|fpl|ga8| 



EAKAD 
EAKAD 
EAKVD 
DAKVD 
EAKAD 



SVMEAWRLPL 
GISKAEALET 
PLELMKVTGL 



SPSIRGKLIS 
LQIIRRECLT 
SYRGVHELLC 



pglgggkve 

gdq| 
gdq| 

AGYTCLSBl - 
NKPRYAGpS . 
§VSR§CAI 



kllneaarlv p] 
kllneaarlv p] 
kiieaasklv p! 
kiveaasklv f 
kilaeaaklv p] 
qiiteaIklv p: 
. . .ngpIq 
.ngpIq: 
.ngp|q: 

.ASVSSSD 
. ESHKKC . . . 

. . . m|ta| gi 




151 



|lkt|tai 



hmr.rselic 
hmr.rselic 
h|q.r|eiiq 
h|q.rqeiiq 

hqr.rseiiq 

I.RADWQ 
LEQ. 





|i| fcsa|d| 
b(tlsa|d| 



201 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 




251 




CSS 

cssl 

CHFi 

jehrkaled ft _ 
.ryfn|ee|l |lts 



301 




WV APLLGGQQRE 
WV SPLLGGQQRE 

nSG. .RG 
SG . . RG 
|SG . . RG 
ISG. .RG 
|SG . . RG 

r. .RG 



300 



G.TIAQQ EATSSG7 
.G^TVAQQ VTGSSG1 
)AAAQ 
IAAAQ 
iEAAS 
ILEAAS 
QASA 
G 




350 



.QLQG 




351 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 



AF034955aa 

AF034956aa 
D10023aa 
X64270aa 
U22441aa 

AtU43652aa 
D14134aa 

NM079844aa 
1107sid2 
1107sid6 
1107sid4 
ac002387pep 

AF029669aa 
U84138aa 




'rd|dgr r.fkpalgrs wsfvpstril ldvtegIgB 

RDRDSG R.LKPALGRS WSFVPSTRIL Ld|iEg|g|s 

^QVDGG MAFNPDPKKP IG 

QVDGG MAFNPDPKKP IG. 
bvDGS AVFAGPQIKP IG . 
QVDGS ALFAGPQFKP IG, 
Iqvdga AMFAADPKKP IG. 
SLDGA PGMF.DAKKP IG. 



400 



d rnqa: 

LSGA LASQADLVSP ADDLSLSEG| s| 




401 
. SPRQP 1 
, SSRQP 1 

Igfkkg: 
Igfkkg: 



|l QEMiq gtlg t: 
gtwg ts| 




451 



472 



^V^M 

CSLQTEGSLS TRKRSRDPEE EL 




450 
1: AF034955. Mus musculus Rad5... [gi:2920579]

LOCUS       AF034955     1699 bp    mRNA    linear   ROD 29-APR-1998
DEFINITION  Mus musculus Rad51d mRNA, complete cds.
ACCESSION   AF034955
VERSION     AF034955.1  GI:2920579
KEYWORDS
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 1699)
  AUTHORS   Pittman,D.L., Weinberg,L.R. and Schimenti,J.C.
  TITLE     Identification, characterization, and genetic mapping of Rad51d, a
            new mouse and human RAD51/RecA-related gene
  JOURNAL   Genomics 49 (1), 103-111 (1998)
  MEDLINE   98234549
REFERENCE   2  (bases 1 to 1699)
  AUTHORS   Pittman,D.L., Weinberg,L.R. and Schimenti,J.C.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-NOV-1997) The Jackson Laboratory, 600 Main Street,
            Bar Harbor, ME 04609, USA
FEATURES             Location/Qualifiers
     source          1..1699
                     /organism="Mus musculus"
                     /db_xref="taxon:10090"
     gene            1..1699
                     /gene="Rad51d"
     CDS             196..1185
                     /gene="Rad51d"
                     /codon_start=1
                     /product="RAD51D"
                     /protein_id="AAC40093.1"
                     /db_xref="GI:2920580"
                     /translation="MGMLRAGLCPGLTEETVQLLRGRKIKTVADLAAADLEEVAQKCG
                     LSYKALVALRRVLLAQFSAFPLNGADLYEELKTSTAILSTGIGSLDKLLDAGLYTGEV
                     TEIVGGPGSGKTQVCLCVAANVAHSLQQNVLYVDSNGGMTASRLLQLLQARTQDEEKQ
                     ASALQRIQVVRSFDIFRMLDMLQDLRGTIAQQEATSSGAVKVVIVDSVTAVVAPLLGG
                     QQREGLALMMQLARELKILARDLGVAVVVTNHLTRDWDGRRFKPALGRSWSFVPSTRI
                     LLDVTEGAGTLGSSQRTVCLTKSPRQPTGLQEMIDIGTLGTEEQSPELPGKQT"

BASE COUNT 391 a 456 c 480 g 372 t 

ORIGIN 

1 attcggcacg aggcttgcga gttgggaagg gttagtgtcc ctcacctcga cttgcatcct 
61 cttcccgccc cttcgtcccg cgctggcacg ccaggacctt ttccctaagt agaactgagg 
121 aatgcccaga gtgggggact cggcgagcgc ccaagtgaca gagagccccc agggcatcct 
181 gggtttgtag ggactatggg catgctcagg gcagggctgt gcccgggcct caccgaggag 
241 accgtccagc ttctcagagg ccgaaagata aaaacagtgg cagacctggc agctgctgac 
301 ttggaggaag tagcccagaa gtgtggcttg tcctacaagg ccctcgttgc cctgaggagg 
361 gtgttgctgg cgcagttctc ggctttcccc ttaaatggcg cagatctcta tgaggaactg 
421 aagacttcca cggccatcct gtccaccggc atcggaagcc tggacaaact acttgatgct 
481 ggcctctata ctggggaggt gactgaaatt gtgggtggcc caggtagcgg caaaacccag 



541 gtgtgtctct gtgtgg<^^c aaatgtggcc catagcctgc agcagal^t actgtatgtg

601 gattccaatg gaggaatgac ggcgtcccgc ctcctccagc tactacaggc tagaacccaa 

661 gatgaggaga aacaggcaag tgctctccag aggatacagg tggtgcgttc atttgacatc 

721 ttccggatgc tagatatgct acaggacctt cgcggcacca tagcccagca ggaagcaact 

781 tcttcaggcg ccgtgaaggt tgtgattgtg gactcggtca ctgcagtggt cgccccactt 
841 ctgggaggtc agcagaggga aggcctggcc ttgatgatgc agctggcccg agagctcaag 

901 atcctggccc gggacctggg tgtggcagtg gtggtgacca accacttgac tcgagattgg 

961 gatggtagaa gattcaaacc tgcccttgga cgctcctgga gctttgtgcc cagtacccgg 

1021 attctcctgg atgtcactga gggggctggg acactcggta gcagccaacg cacagtatgt 

1081 ctgaccaagt ctccccgcca gccaacgggt ctgcaggaga tgatagacat tgggacattg 

1141 gggactgagg agcagagccc agaattacct ggcaagcaga cgtgacactg ttgattggga 

1201 aagggacagc ctcaaggccc ccacagcttt ccttcccggg cagctactgt cccctgccac 

1261 atgggaccta gcagtgcaga ggtctcaggc tcatcagaag acaagcctca gtcccctcag 

1321 cttcattgtg tgagatctca tggagcccct ggccagggga tgctgcagtg agagagacct 

1381 agaagtccat cagggtctca agttttcttc ctttgtgtct accgaatgtt tccagtgtgg 

1441 gttgaggtca acctagcttg aggcctcact gaagaaacac agtagtacct gcatgaatag 

1501 tgtcctggtc ctgtgtgtta ccattggttc ctaaacccct ccctcacaca gtaattgggc 

1561 ctttgctcac ccattccccc ccccccattt attttggtct gttttcccat ctgcaagatg 

1621 gcagtttctc agtctgcaag ttattcagta gaaacaacct cagaaatgtc aaaaaaaaaa 



1681 aaaaaaaaaa aaaaaaaaa
//
1: AF034956. Homo sapiens RAD5...[gi:2920581]

LOCUS       AF034956     1598 bp    mRNA    linear   PRI 29-APR-1998
DEFINITION  Homo sapiens RAD51D mRNA, complete cds.
ACCESSION   AF034956
VERSION     AF034956.1  GI:2920581
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1598)
  AUTHORS   Pittman,D.L., Weinberg,L.R. and Schimenti,J.C.
  TITLE     Identification, characterization, and genetic mapping of Rad51d, a
            new mouse and human RAD51/RecA-related gene
  JOURNAL   Genomics 49 (1), 103-111 (1998)
  MEDLINE   98234549
REFERENCE   2  (bases 1 to 1598)
  AUTHORS   Pittman,D.L., Weinberg,L.R. and Schimenti,J.C.
  TITLE     Direct Submission
  JOURNAL   Submitted (18-NOV-1997) The Jackson Laboratory, 600 Main Street,
            Bar Harbor, ME 04609, USA
FEATURES             Location/Qualifiers
     source          1..1598
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
     gene            1..1598
                     /gene="RAD51D"
     CDS             125..994
                     /gene="RAD51D"
                     /codon_start=1
                     /product="RAD51D"
                     /protein_id="AAC39719.1"
                     /db_xref="GI:2920582"
                     /translation="MGVLRVGLCPGLTEEMIQLLRSHRIKTVVDLVSADLEEVAQKCG
                     LSYKXLDKLLDAGLYTGEVTEIVGGPGSGKTQVCLCMAANVAHGLQQNVLYVDSNGGL
                     TASRLLQLLQAKTQDEEEQAEALRRIQVVHAFDIFQMLDVLQELRGTVAQQVTGSSGT
                     VKVVVVDSVTAVVSPLLGGQQREGLALMMQLARELKTLARDLGMAVVVTNHITRDRDS
                     GRLKPALGRSWSFVPSTRILLDTIEGAGASGGRRMACLAKSSRQPTGFQEMVDIGTWG
                     TSEQSATLQGDQT"
BASE COUNT      364 a    402 c    455 g    375 t      2 others
ORIGIN

1 attcggcacg agcgcgcctg tgtcctctct aggaaggggt aggggagggg cgtctggaga 

61 ggaccccccg cgaatgccca cgtgacgtgc agtccccctg gggctgttcc ggcctgcggg 

121 gaacatgggc gtgctcaggg tcggactgtg ccctggcctt accgaggaga tgatccagct 

181 tctcaggagc cacaggatca agacagtggt ggacctggtt tctgcagacc tggaagaggt 

241 agctcagaaa tgtggcttgt cttacaagtn ncttgataaa ctgcttgatg ctggtctcta 

301 tactggagaa gtgactgaaa ttgtaggagg cccaggtagc ggcaaaactc aggtatgtct 

361 ctgtatggca gcaaatgtgg cccatggcct gcagcaaaac gtcctatatg tagattccaa 

421 tggagggctg acagcttccc gcctcctcca gctgcttcag gctaaaaccc aggatgagga 

481 ggaacaggca gaagctctcc ggaggatcca ggtggtgcat gcatttgaca tcttccagat 

541 gctggatgtg ctgcaggagc tccgaggcac tgtggcccag caggtgactg gttcttcagg 
601 aactgtgaag gtggtg^g tggactcggt cactgcggtg gtttcc^Bc ttctgggagg
661 tcagcagagg gaaggcttgg ccttgatgat gcagctggcc cgagagctga agaccctggc 
721 ccgggacctt ggcatggcag tggtggtgac caaccacata actcgagaca gggacagcgg 
781 gaggctcaaa cctgccctcg gacgctcctg gagctttgtg cccagcactc ggattctcct 
841 ggacaccatc gagggagcag gagcatcagg cggccggcgc atggcgtgtc tggccaaatc 
901 ttcccgacag ccaacaggtt tccaggagat ggtagacatt gggacctggg ggacctcaga 
961 gcagagtgcc acattacagg gtgatcagac atgacctgtg ctgttgtttg ggaaacaggg 
1021 aagcattggg gacccctccc aacttttctt cccagtaacg cctgctgttt actgccacct 
1081 ggcactggtg actacagacg ttctcaggct ggccagaaga gacatcttgg gttccttggc 
1141 ctcactctct gtaagcatat aaaccacagg cgaaagagga tgctgcattg cgaggaccca 
1201 gaaattcata ctggtgccac gtttccttcc cttatttcta acgtgtatgt ttctggtgga 
1261 aaccaagttc accctggctg ggagcatctc tgatgaggca tgctggcgac tggatggata 
1321 atcctgtgca tcaccattgt gtcctgtgct ccctcctagc gcagtggcca agccgggaaa 
1381 gcctctaact tgcctttgct gctgctgcct tttttttctt ttgtctctgc ctttccattt 
1441 gttagatggg ggcccactct tccttagctc tgtctctgag ttactgggtg gaaataagct 
1501 tataaatgaa atactcttct tcatctctgt tttgctctta aaaatataaa aaggcaattc 
1561 cccgaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 

// 
1: D10023. S.cerevisiae Rad5...[gi:218468]

LOCUS       YSCRAD51     3724 bp    DNA     linear   PLN 02-FEB-1999
DEFINITION  S.cerevisiae Rad51 protein gene.
ACCESSION   D10023
VERSION     D10023.1  GI:218468
KEYWORDS    Rad51 protein.
SOURCE      Saccharomyces cerevisiae DNA.
  ORGANISM  Saccharomyces cerevisiae
            Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
            Saccharomycetales; Saccharomycetaceae; Saccharomyces.
REFERENCE   1  (bases 1 to 3724)
  AUTHORS   Shinohara,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-NOV-1991) Akira Shinohara, Faculty of Science, Osaka
            University, Department of Biology; Toyonaka, Osaka 560, Japan
            (E-mail:c62528@center.osaka-u.ac.jp, Tel:06-844-1151(ex.4305),
            Fax:06-841-2449)
REFERENCE   2  (bases 1 to 3724)
  AUTHORS   Shinohara,A., Ogawa,H. and Ogawa,T.
  TITLE     Rad51 protein involved in repair and recombination in S. cerevisiae
            is a RecA-like protein
  JOURNAL   Cell 69 (3), 457-470 (1992)
  MEDLINE   92257587
  REMARK    Erratum:[[published erratum appears in Cell 1992 Oct
            2;71(1):following 180]]
COMMENT     Submitted (29-NOV-1991) to DDBJ by:
            Akira Shinohara
            Department of Biology
            Faculty of Science
            Osaka University
            Toyonaka, Osaka 560
            Japan
            Phone:  06-844-1151 x4305
            Fax:    06-841-2449
FEATURES             Location/Qualifiers
     source          1..3724
                     /organism="Saccharomyces cerevisiae"
                     /db_xref="taxon:4932"
     misc_signal     187..204
                     /note="Box B"
     repeat_region   350..370
                     /rpt_type=direct
                     /rpt_unit=350..359
                     /rpt_unit=361..370
     misc_signal     506..542
                     /note="Box A"
     TATA_signal     569..572
     gene            645..1847
                     /gene="RAD51"
     CDS             645..1847
                     /gene="RAD51"
                     /codon_start=1
                     /product="Rad51 protein"
                     /protein_id="BAA00913.1"
                     /db_xref="GI:218469"
                     /translation="MSQVQEQHISESQLQYGNGSLMSTVPADLSQSVVDGNGNGSSED
                     IEATNGSGDGGGLQEQAEAQGEMEDEAYDEAALGSFVPIEKLQVNGITMADVKKLRES
                     GLHTAEAVAYAPRKDLLEIKGISEAKADKLLNEAARLVPMGFVTAADFHMRRSELICL
                     TTGSKNLDTLLGGGVETGSITELFGEFRTGKSQLCHTLAVTCQIPLDIGGGEGKCLYI
                     DTEGTFRPVRLVSIAQRFGLDPDDALNNVAYARAYNADHQLRLLDAAAQMMSESRFSL
                     IVVDSVMALYRTDFSGRGELSARQMHLAKFMRALQRLADQFGVAVVVTNQVVAQVDGG
                     MAFNPDPKKPIGGNIMAHSSTTRLGFKKGKGCQRLCKVVDSPCLPEAECVFAIYEDGV
                     GDPREEDE"
     terminator      1927..1932
     CDS             3030..>3724
                     /note="unknown gene product"
                     /codon_start=1
                     /protein_id="BAA20966.1"
                     /db_xref="GI:4433385"
                     /translation="MWPMFTLARKTRRACYQDGKPLKCTGEWKKTQDIYSNFQYAQY
                     ILRVGLDTEKLHELVKELEDESNSFSVDSLKEYLVNDAKVILKRLSAVGYPDAQYLLG
                     DAYSSGVFGKIKNRRAFLLFSAAAKRMHIESVYRTAICYECGLGVTRNAPKAVNFLTF
                     AATKNHPAAMYKLGVYSYHGLMGLPDDILTKMDGYRWLRRATSMASSFVCGAPFELAN
                     IYMTGYKDLIISDP"

BASE COUNT     1021 a    757 c    806 g   1140 t
ORIGIN      Chromosome V.

cacatgacta
tggacggtaa

1 ggatccgaca tttttttttt atgctttatt cactgttcaa 
61 agaaacgcac tctacttcga aactacggtt caaacttact 
121 tggcctttct actatgccat aaactctttc ttcctctctt 
181 acttttttgc caccggcagt gccatccggt 
241 ggcttatcat tgtcacagag taaattaaaa 
301 ccgttcttca accaatctag tttagctatc 
tgagcattcc aaccggttgt atcagtgttt 
gttaccctat gctacgcgtc 
ggtgggacca taaaggggaa 
tatcttccgt agtttccata 
atttgttaaa ggcctactaa 
atcagagtca cagcttcagt 



361 
421 
481 
541 
601 
661 
721 
781 
841 
901 



gccacacttc 
cagtacgcgt 
ttacttcttc 
agacgtagtt 
aacaacatat 
cagcagacct 
aggccaccaa 
aaatggagga 
tgcaagtgaa 
961 ctgctgaagc 
1021 aagctaaggc 
cggctgctga 
atttggacac 



tattttcacc acaattcgca 
tagcagcttc ccgatttaat 
ttcatcgccc ctgcatttgc 
caccacgtta atagcgatct 
atgttggaaa tgcaccacta 
ctgcaacagg tggccttctt gagcattccc 
tatcaccgtc tcaccatatc ccacgactag 
atttccgcta tttctgtcct ggtttgttta 
tagtggggac tggagaaaaa attttctcag 
tactagtagt tgagtgtagc gacaaagagc 
tttgttatcg tcatatgtct caagttcaag 



acgggaacgg ttcgttgatg tccactgtac 

ttcacagtca gtcgttgatg gaaacggcaa cggtagcagc gaagatattg 

cggctccggc gatggtggcg gattgcagga gcaagcggaa gcgcaaggtg 

tgaagcatac gatgaagctg ccttaggttc gtttgtgcca atagaaaaac 

cgggattact atggcggatg tgaaaaaact 

ggtagcatat gctcccagaa aggatttatt 

agataagttg ctaaacgaag cggcaaggct 

ttttcatatg agaagatcgg agctgatttg 

tcttttgggt ggtggtgtgg aaactggttc 

aattcaggac aggtaagtcc cagctatgtc acactttggc cgtgacatgc caaattccat 

tggatattgg tggcggtgaa ggtaagtgtt tgtatatcga taccgaaggt actttcaggc 

ggtatccata gctcagcggt tcggattaga cccggatgat gctttgaaca 

tgcaagagcc tataacgccg atcatcagtt aagacttctg gatgctgctg 

gagcgagtct cggttttcct tgattgtggt cgattctgtt atggctctat 



1081 
1141 
1201 
1261 

1321 cggtaagatt 
1381 acgttgcgta 
1441 cccaaatgat 
1501 accgtacgga 
1561 ttatgcgtgc 
1621 aagtggtcgc 
1681 gtggtaatat 
1741 gtcaaagatt 
1801 cgatctatga 
1861 tgtctctatt 
1921 cgtacatttt 
1981 tacctcgcaa 
2041 attgtctgtc 
2101 tcctgcaggg 
2161 tgtcgccatc 
2221 ttccttgtct 



aagggagagt gggcttcaca 
ggaaatcaaa ggtatatcgg 
agtgcctatg ggatttgtca 
tttgacaacg ggttctaaga 
tattactgag cttttcggtg 



tttttctggt cgtggtgaac 
tttgcaaagg ctggccgacc 
ccaagttgat ggtggtatgg 
tatggcacat tcttccacca 
atgcaaagtt gttgactcac 



taagcgcaag gcaaatgcat ttagccaaat 
aatttggtgt tgcagtcgtc gttactaacc 
cttttaatcc agatccaaag aagcctatcg 
cgcgattagg tttcaaaaag ggtaagggat 
cttgcttacc agaggctgaa tgtgtgttcg 



agatggtgtt ggtgacccca gagaagaaga cgagtaggta tttggtctct 



tatttacaca ggtttacttt 
tatcttcatt tccatccact 
ccctactgcg gtcttaacct 
acccatgaaa tatatgtatt 
tccgcgcggc tttatccttt 



caattctcct ctttttctta ggttgcgttc 
gtcttagatt tttgcatata ttttgtcata 
tttttttcag ttcttttaaa taactttcgt 
tttctactct tcttcccgat gactacttcc 
taggggagtg aagagaaaaa ttttctgata 
ctcttataaa accaacgtaa aataccatac acttctttat ggcaatgaat 
tcgtttcgac gtttttaaat tgaataaaaa attgtacttg gaaaaaagaa 
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2281 tagaacaata cttttgcraa cagcaagaga agtacggttg acatta^^g tatccctttg 
2341 ttaaaatcag gcagtaccca taatgtccat gactatttgc tcaaatactc ctggtgcata 
2401 ccctgaaatt ggagcttata atgagggtga taacagttgg aatcaagcgg attctcccct 
2461 gattcctctc tgatattgaa taaaccagaa gttcgtcaat actggtcatc cgtttcatct 
2521 catatttcac gctcaggtga tgtgtttaca aatgataaag aaaaaatttc atcctctatt 
2581 ggtgaggatg cgatggatat cgacgcttcg ccctcattga ttgaaaaaat aactcttttc 
2641 ccacaagaaa gattcttcct gagcaggacg agtttgagaa cgacgttgag gacgatgctt 
2701 cttcctcact gaaggaaaaa tcacaaggtt cttgtgaaat tgaaattgct tctgaaattt 
2761 catccgaaat tttaaatggc acatcggcag acggtaattc cgagtttcac gacttcgctg 
2821 agccccctcc ctctcagaat gaatctgtcg ctctgtcttt tagtcaatcg aatgatttgg 
2881 acttttcgaa taatccaagc ggatcaggct cttcaacgat atcaatagaa gtacaagttc 
2941 catttcactt cctagacatg tgtcactaga ttttaacgtt tacaatagtc tttgcctcac 
3001 aaatgaggtg actgcatcag aatcacataa tgtggccaaa tttcaccttg gcaaggaaaa 
3061 caagaagagc ttgctaccaa gatggaaaac cattgaaatg taccggggag gtagttaaga
3121 aaactcaaga catatattct aattttcagt atgcacaata cattttaagg gtagggttgg 
3181 ataccgaaaa actacacgaa cttgttaagg aactggagga tgaaagcaat agcttctctg 
3241 tagattcttt aaaggaatat ttagtgaacg atgccaaagt tattttgaag agactaagtg 
3301 ccgttggata ccctgatgca cagtaccttc taggtgacgc ctactcctct ggagttttcg 
3361 gtaaaataaa gaatagaaga gcatttcttt tattctccgc tgcagccaaa agaatgcaca 
3421 tcgaaagtgt ttacagaact gccatttgct atgagtgcgg cttgggtgta accagaaatg 
3481 caccgaaggc ggttaacttt ttgacttttg ctgcaactaa gaaccatcct gcagcaatgt 
3541 acaaattagg agtatattcg tatcatggtt tgatgggtct tccagatgat attctaacca 
3601 aaatggatgg ttatagatgg ttgcgaaggg ctacatctat ggctagcagc tttgtttgtg 
3661 gtgctccttt tgaattagcc aatatttata tgacgggata taaagatctt atcatttcgg 
3721 atcc 

// 
1: X64270. S.cerevisiae RAD5...[gi:4274]

LOCUS SCRADDNA 2173 bp DNA linear PLN 03-DEC-1993
DEFINITION S.cerevisiae RAD51 gene.
ACCESSION X64270 S38937
VERSION X64270.1 GI:4274
KEYWORDS DNA repair; RAD51 gene; recombination and repair.
SOURCE baker's yeast.
ORGANISM Saccharomyces cerevisiae
Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;
Saccharomycetales; Saccharomycetaceae; Saccharomyces.
REFERENCE 1 (bases 1 to 2173)
AUTHORS Fabre,F.
TITLE Direct Submission
JOURNAL Submitted (22-JAN-1992) F. Fabre, Institut Curie, Section de
Biologie, Centre Universitaire, Batiment 110, 91405 Orsay, FRANCE
REFERENCE 2 (bases 1 to 2173)
AUTHORS Aboussekhra,A., Chanet,R., Adjiri,A. and Fabre,F.
TITLE Semidominant suppressors of Srs2 helicase mutations of
Saccharomyces cerevisiae map in the RAD51 gene, whose sequence
predicts a protein with similarities to procaryotic RecA proteins
JOURNAL Mol. Cell. Biol. 12 (7), 3224-3234 (1992)
MEDLINE 92318940
FEATURES Location/Qualifiers
source 1..2173
/organism="Saccharomyces cerevisiae"
/strain="S288C"
/sub_strain="GRF88"
/db_xref="taxon:4932"
/chromosome="VR"
/map="80cM"
misc_signal 444..449
/note="DNA synthesis control sequence"
misc_signal 485..490
/note="DNA synthesis control sequence"
gene 645..1847
/gene="RAD51"
CDS 645..1847
/gene="RAD51"
/function="repair of DNA double strand breaks"
/note="similarities to procaryotic RecA"
/codon_start=1
/protein_id="CAA45563.1"
/db_xref="GI:4275"
/db_xref="SWISS-PROT:P25454"
/translation="MSQVQEQHISESQLQYGNGSLMSTVPADLSQSVVDGNGNGSSED
IEATNGSGDGGGLQEQAEAQGEMEDEAYDEAALGSFVPIEKLQVNGITMADVKKLRES
GLHTAEAVAYAPRKDLLEIKGISEAKADKLLNEAARLVPMGFVTAADFHMRRSELICL
TTGSKNLDTLLGGGVETGSITELFGEFRTGKSQLCHTLAVTCQIPLDIGGGEGKCLYI
DTEGTFRPVRLVSIAQRFGLDPDDALNNVAYARAYNADHQLRLLDAAAQMMSESRFSL
IVVDSVMALYRTDFSGRGELSARQMHLAKFMRALQRLADQFGVAVVVTNQVVAQVDGG
MAFNPDPKKPIGGNIMAHSSTTRLGFKKGKGCQRLCKVVDSPCLPEAECVFAIYEDGV
GDPREEDE"
terminator 2063..2067
BASE COUNT 541 a 461 c 496 g 675 t
ORIGIN

1 ggatccgaca tttttttttt atgctttatt cactgttcaa tattttcacc acaattcgca
61 agaaacgcac tctacttcga aactacggtt caaacttact tagcagcttc ccgatttaat
121 tggcctttct actatgccat aaactctttc ttcctctctt ttcatcgccc ctgcatttgc
181 acttttttgc caccggcagt gccatccggt cacatgacta caccacgtta atagcgatct
241 ggcttatcat tgtcacagag taaattaaaa tggacggtaa atgttggaaa tgcaccacta
301 ccgttcttca accaatctag tttagctatc ctgcaacagg tggccttctt gagcattccc
361 tgagcattcc aaccggttgt atcagtgttt tatcaccgtc tcaccatatc ccacgactag
421 gccacacttc gttaccctat gctacgcgtc atttccgcta tttctgtcct ggtttgttta
481 cagtacgcgt ggtgggacca taaaggggaa tagtggggac tggagaaaaa attttctcag
541 ttacttcttc tatcttccgt agtttccata tactagtagt tgagtgtagc gacaaagagc
601 agacgtagtt atttgttaaa ggcctactaa tttgttatcg tcatatgtct caagttcaag
661 aacaacatat atcagagtca cagcttcagt acgggaacgg ttcgttgatg tccactgtac
721 cagcagacct ttcacagtca gtcgttgatg gaaacggcaa cggtagcagc gaagatattg
781 aggccaccaa cggctccggc gatggtggcg gattgcagga gcaagcggaa gcgcaaggtg
841 aaatggagga tgaagcatac gatgaagctg ccttaggttc gtttgtgcca atagaaaaac
901 tgcaagtgaa cgggattact atggcggatg tgaaaaaact aagggagagt gggcttcaca
961 ctgctgaagc ggtagcatat gctcccagaa aggatttatt ggaaatcaaa ggtatatcgg
1021 aagctaaggc agataagttg ctaaacgaag cggcaaggct agtgcctatg ggatttgtca
1081 cggctgctga ttttcatatg agaagatcgg agctgatttg tttgacaacg ggttctaaga
1141 atttggacac tcttttgggt ggtggtgtgg aaactggttc tattactgag cttttcggtg
1201 aattcaggac aggtaagtcc cagctatgtc acactttggc cgtgacatgc caaattccat
1261 tggatattgg tggcggtgaa ggtaagtgtt tgtatatcga taccgaaggt actttcaggc
1321 cggtaagatt ggtatccata gctcagcggt tcggattaga cccggatgat gctttgaaca
1381 acgttgcgta tgcaagagcc tataacgccg atcatcagtt aagacttctg gatgctgctg
1441 cccaaatgat gagcgagtct cggttttcct tgattgtggt cgattctgtt atggctctat
1501 accgtacgga tttttctggt cgtggtgaac taagcgcaag gcaaatgcat ttagccaaat
1561 ttatgcgtgc tttgcaaagg ctggccgacc aatttggtgt tgcagtcgtc gttactaacc
1621 aagtggtcgc ccaagttgat ggtggtatgg cttttaatcc agatccaaag aagcctatcg
1681 gtggtaatat tatggcacat tcttccacca cgcgattagg tttcaaaaag ggtaagggat
1741 gtcaaagatt atgcaaagtt gttgactcac cttgcttacc agaggctgaa tgtgtgttcg
1801 cgatctatga agatggtgtt ggtgacccca gagaagaaga cgagtaggta tttggtctct
1861 tgtctctatt tatttacaca ggtttacttt caattctcct ctttttctta ggttgcgttc
1921 cgtacatttt tatcttcatt tccatccact gtcttagatt tttgcatata ttttgtcata
1981 tacctcgcaa ccctactgcg gtcttaacct tttttttcag ttcttttaaa taactttcgt
2041 attgtctgtc acccatgaaa tatatgtatt tttctactct tcttcccgat gactacttcc
2101 tcctgcaggg tccgcgcggc tttatccttt taggggagtg aagagaaaaa ttttctgata
2161 tgtcgccatc ctc

//
1: U22441. Lycopersicon escu...[gi:1143809]

LOCUS LEU22441 1241 bp mRNA linear PLN 18-JUN-1998
DEFINITION Lycopersicon esculentum LeRAD51 (RAD51) mRNA, complete cds.
ACCESSION U22441
VERSION U22441.1 GI:1143809
KEYWORDS .
SOURCE tomato.
ORGANISM Lycopersicon esculentum
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Asteridae; euasterids I; Solanales; Solanaceae; Solanum;
Lycopersicon.
REFERENCE 1 (bases 1 to 1241)
AUTHORS Yeager Stassen,N., Logsdon,J.M. Jr., Vora,G.J., Offenberg,H.H.,
Palmer,J.D. and Zolan,M.E.
TITLE Isolation and characterization of rad51 orthologs from Coprinus
cinereus and Lycopersicon esculentum, and phylogenetic analysis of
eukaryotic recA homologs
JOURNAL Curr. Genet. 31 (2), 144-157 (1997)
MEDLINE 97174112
REFERENCE 2 (bases 1 to 1241)
AUTHORS Yeager Stassen,N., Logsdon,J.M. Jr., Vora,G.J., Offenberg,H.H.,
Palmer,J.D. and Zolan,M.E.
TITLE Direct Submission
JOURNAL Submitted (09-MAR-1995) Biology, Indiana University, Bloomington,
IN 47405, USA
FEATURES Location/Qualifiers
source 1..1241
/organism="Lycopersicon esculentum"
/db_xref="taxon:4081"
/tissue_type="anther"
gene 1..1241
/gene="RAD51"
CDS 98..1126
/gene="RAD51"
/note="similar to human RAD51 protein: PIR Accession
Number S37673 and to RecA-like protein, encoded by Genbank
Accession Number Z24756"
/codon_start=1
/product="LeRAD51"
/protein_id="AAC23700.1"
/db_xref="GI:1143810"
/translation="MEQQHRNQKSMQDQNDEIEDVQHGPFPVEQLQASGIAALDVKKL
KDAGLCTVESVVYAPRKELLQIKGISEAKVDKIIEAASKLVPLGFTSASQLHAQRLEI
IQITSGSKELDKILEGGIETGSITEIYGEFRCGKTQLCHTLCVTCQLPLDQGGGEGKA
MYIDAEGTFRPQRLLQIADRYGLNGPDVLENVAYARAYNTDHQSRLLLEAASMMVETR
FALMIVDSATALYRTDFSGRGELSARQMHLAKFLRSLQKLADEFGVAVVITNQVVAQV
DGSAVFAGPQIKPIGGNIMAHASTTRLALRKGRAEERICKVVSSPCLAEAEARFQISV
EGVTDVKD"
polyA_site 1241
/gene="RAD51"
/note='W A nucleotides"
BASE COUNT 361 a 227 c 302 g 351 t

ORIGIN 

1 gaattcggca cgagattttt ggcggctcct cacgcctttg aaatatttcc ttttgcctcc 
61 attcttccat tctcagtgag catcaagaaa attagcaatg gagcagcagc acaggaatca 
121 gaagtcgatg caagaccaaa atgatgaaat cgaggatgtt caacacggcc cttttccagt 
181 tgaacaactt caggcatcag ggattgcagc tctagatgta aaaaaactca aggatgctgg 
241 tctatgtaca gttgaatctg ttgtttatgc tccaagaaag gaacttctgc agataaaagg 
301 aattagtgaa gctaaagttg acaagattat tgaggcagct tcaaaattag tgcctttggg 
361 attcactagt gccagccaac tccatgcaca gaggcttgaa atcatacaga taacttctgg 
421 atcgaaagaa cttgacaaga tattagaagg aggaatcgaa actggatcta ttactgaaat 
481 ttacggagag ttccgatgtg gaaagactca gctgtgtcac acactatgcg tgacttgtca 
541 acttccatta gatcagggag gtggtgaagg gaaagcaatg tacattgatg ctgagggtac 
601 tttcagacca caaagacttt tacaaattgc agacaggtat ggattgaatg gtcctgatgt 
661 cctggagaat gtagcctatg ctcgagctta taataccgat catcaatcaa gacttttgct 
721 tgaggcagcc tcaatgatgg tggagaccag gtttgctctc atgattgtgg acagtgctac 
781 tgccctttat agaactgact tctctgggag aggagagttg tctgccaggc agatgcatct 
841 tgcaaagttt ctgagaagcc ttcagaagtt agcagatgag tttggtgttg ctgttgttat 
901 tacgaaccaa gttgttgctc aagtggatgg ttctgctgta tttgctgggc ctcaaataaa 
961 acccattggt ggcaacatca tggcacatgc ttctacgacg agactagctc tgaggaaggg 
1021 tagggccgag gagcggattt gtaaagtagt cagttcgcca tgcttagctg aagcagaagc 
1081 aagatttcaa atttctgttg aaggagtcac tgatgtaaag gactaaatgt gtgatcagca 
1141 cattgtttac tactagtact attttttgtt tcctacttgg tgtacgattt tgtcatcgtt 
1201 ttgaaggtta gttaaccata aaaagaagta tgctatatgg a 
1: U43652. Arabidopsis thali...[gi:1706948]

LOCUS ATU43652 4868 bp DNA linear PLN 04-DEC-1996
DEFINITION Arabidopsis thaliana RAD51 homolog (AtRad51) gene, complete cds,
and tRNA-Cys gene, complete sequence.
ACCESSION U43652
VERSION U43652.1 GI:1706948
KEYWORDS .
SOURCE thale cress strain=Landsberg erecta.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 4868)
AUTHORS Urban,C., Smith,K.N. and Beier,H.
TITLE Nucleotide sequences of nuclear tRNA(Cys) genes from Nicotiana and
Arabidopsis and expression in HeLa cell extract
JOURNAL Plant Mol. Biol. 32 (3), 549-552 (1996)
MEDLINE 97134945
REFERENCE 2 (bases 1 to 4868)
AUTHORS Smith,K.N., Shinohara,A. and Signer,E.R.
TITLE Direct Submission
JOURNAL Submitted (19-DEC-1995) Kathleen N. Smith, Department of Biology,
MIT, Cambridge, MA 02139, USA
FEATURES Location/Qualifiers
source 1..4868
/organism="Arabidopsis thaliana"
/strain="Landsberg erecta"
/db_xref="taxon:3702"
/chromosome="5"
/map="linked TSL and mi138"
tRNA 197..268
/note="designated pAtC1"
/product="tRNA-Cys"
tRNA 603..674
/note="designated pAtC2"
/product="tRNA-Cys"
tRNA 1016..1087
/note="designated pAtC3"
/pseudo
mRNA join(1708..1853,2258..2402,2513..2623,2698..2789,
2921..3015,3101..3214,3303..3432,3520..3641,3737..3958)
/gene="AtRad51"
/note="RAD51 homolog"
/product="AtRAD51"
gene 1708..3958
/gene="AtRad51"
CDS join(1758..1853,2258..2402,2513..2623,2698..2789,
2921..3015,3101..3214,3303..3432,3520..3641,3737..3860)
/gene="AtRad51"
/note="RAD51 homolog; the translational start site has not
been determined, there are two in-frame Methionine
residues in the predicted protein sequence"
/codon_start=1
/product="AtRAD51"
/protein_id="AAC49555.1"
/db_xref="GI:1706949"
/translation="MTTMEQRRNQNAVQQQDDEETQHGPFPVEQLQAAGIASVDVKKL
RDAGLCTVEGVAYTPRKDLLQIKGISDAKVDKIVEAASKLVPLGFTSASQLHAQRQEI
IQITSGSRELDKVLEGGIETGSITELYGEFRSGKTQLCHTLCVTCQLPMDQGGGEGKA
MYIDAEGTFRPQRLLQIADRFGLNGADVLENVAYARAYNTDHQSRLLLEAASMMIETR
FALLIVDSATALYRTDFSGRGELSARQMHLAKFLRSLQKLADEFGVAVVITNQVVAQV
DGSALFAGPQFKPIGGNIMAHATTTRLALRKGRAEERICKVISSPCLPEAEARFQIST
EGVTDCKD"

BASE COUNT 1469 a 745 c 879 g 1775 t 

ORIGIN 

1 tttgtatata attctttagc aagtgaatat gtttttcttt ataatttgaa ggtttaattt 
61 gtttgtgaaa attgtttttg ataatttata tgattactaa ataagtaaac aattgacttg 
121 cttatattag atttcttagc aaaaaaacaa ttaatgaaat aaacaattta ttattttgaa 
181 cttattaaag caataaaggt ccatagctca gtggtagagc aattgactgc agatcaatag 
241 gtcaccggtt cgaacccggt tgggccctat attttttcag tttaccaata atttttaata 
301 agagctttag ttatatattt gattagcatt tagcgtcaag tagttgacta atcttatatt 
361 tcaaggtttt agttagtaat ttttcatctt gaataaaata aattttgaca agtttttgta 
421 tataattctt gagcaactga aaatattttt ttaaaatttc atggtttagt ttgtttgtta 
481 tttgttttat gaaataatta tgtgatttct aaataagtaa ataattgact atattagatt 
541 ttttagcaaa aaaacaattg acgaaataaa taatttacaa tttgaactta ttaaagcaat 
601 aagggtccat agctcagtgg tagagcaatt gactgcagat caataggtca ccggttcgaa
661 cccggttggg ccctatattt ttcagtttac caatattttt tataagagct ttattaatat 
721 atttgattaa catttagcgt ctactagttg actaatttta tattataagg ttttttctcg 
781 caatttttca tcttgaataa aaacattttg acaagttttt taatataatt cattagcaac 
841 tgaaaatgtt ttttttttca tttcatggtt taatttgttt gtgaaaattg tttttgataa 
901 tttatatgat tactaaataa gtaaataact gacttggata tattagattt cttagcaaaa 
961 aataaaataa ttaatagata aataatttat gattttgaac ttattaaagt aataagggtt 
1021 cataactcag tggtagagca attgactgca gatcaatagg tcaccggttt gaacccggtt 
1081 aggccctata ttttttagtt taccaaaaaa atttaaatat catcttgaat aaagaaaatt 
1141 gacaaatttt gtgatatttg taatatttta tttttattat aataagtgat tactacattg 
1201 ttggaattgt ggtggttctc ggcggtcaaa cacctaggta ccatttggtt gacattcaaa 
1261 cacctaggta ccactcggcg gtcaaacacc tattgttttt acaaaacgtt aatttagtgt 
1321 tttaaaaata tataatttta agtaaaaata atttaaaata aaaaaataat tttgagaatc 
1381 gataattcga tcaactttga taaatatcta tcatttataa tttcatgcat ttaactgaaa 
1441 atttaaaatt actatggtac ttaattaata ataaaaatga ggaggatttt gttgttgttt 
1501 ttgagtattt tatagaataa gaatttgggc tttaatagcc tttaaagccc aatatgatca 
1561 aggccgagga aaagctgacc caaacgtaat cgagacttgt tgaagaagcc tttgccctca 
1621 tcgtcgtctt gtataataat tttggttgtg gcgcttcttt caatttgttt tcagtttcgc 
1681 catttccctc cactctcaag ctctcttttg cttctctcgc tttctctggt gacccgaatc 
1741 tgctctgatt gagagaaatg acgacgatgg agcagcgtag aaaccagaat gctgtccaac 
1801 aacaagacga tgaagaaacc cagcacggac ctttccctgt cgaacagctt caggttcaat 
1861 ttctgcagat ttctttcatt tgttgtggaa atttaccttc gcagtctata cgcttcatca 
1921 gatcttgcgc taattgaaat gggttcggct ttaattcact aaaaatttct gctttttttg 
1981 tacatcgcga tgtgaaatag gtgtgaatct gggcgagaat tgagtttcgc tactgtctgt 
2041 taggcgaatt ggttttaggg gttcgttcaa ttcagaaaaa aatcgacact ttttgggggt 
2101 tttatccttt tctggggaac ttctctattg ctgtgactcc agtgtcatat aatctctata 
2161 gttcttctca attcgtgttc ataaaaggga gatgcatcag tgtgttttgt gttatgatta 
2221 tatgctattt catctcttta aatcttcaaa acttcaggca gcaggtattg cttctgttga 
2281 tgtaaagaag cttagggatg ctggtctctg tactgttgaa ggtgttgctt atactccgag 
2341 gaaggatctc ttgcagatta aaggaattag tgatgccaag gttgacaaga ttgtagaagc 
2401 aggtattaca catgtagaaa ctttttgctt tccttctctg ttaaatacat aacacctctt 
2461 tgatactctt agagttaatg tgtgttctta tttgtgtttt cttctgtgat agcttcaaag 
2521 ctagttcctc tggggttcac tagtgcgagc cagctccatg ctcagagaca ggaaattatt 
2581 cagattacct ctggatcacg ggagctcgat aaagttctag aaggttggtg ctggtttctg 
2641 gtgcatctta taacggcatt ttgttattgt gatcttactt tgccgtatgc tcaacaggag 
2701 gtattgaaac tggttccatc acggagttat atggtgagtt ccgctctgga aagactcagc 
2761 tgtgccatac actgtgtgtg acttgtcaag taagaaccct gttcctatac cactcagtta 
2821 ttgttgctta tctagcagaa atcagtgatc ttgtctttct tcttacaatt tccaaacctg 
2881 agaaagatat tcacattaag gttcacgttt gaatgattag cttcccatgg atcaaggagg 
2941 tggagaggga aaggccatgt acattgatgc tgagggaaca ttcaggdTSc aaagactctt
3001 acagatagct gacaggtatg ttattcttgt aacacacatg agatcacaat agcgttcttg
3061 agacggaatc atctgtaaga aatcaaatga aatgatgcag gtttggatta aatggagctg
3121 atgtactaga aaacgttgcc tatgcgaggg cgtataatac agatcatcag tcaaggcttt
3181 tgcttgaagc agcatcaatg atgattgaaa caaggtgtgt tgagtttatt ttggtgttca
3241 gttccgttat gtctatcatg agagtgctaa cttttaaagt tacgatgacc ctattggcac
3301 aggtttgctc tcctgattgt cgatagtgct accgctctct acagaacaga tttctctgga
3361 aggggagagc tttcggctcg acaaatgcat cttgcaaagt tcttgagaag tcttcagaag
3421 ttagcagatg aggtgaacat tcagtactaa ctgtttcttt tccaatattg ctctgagaca
3481 ccatcctgaa gtattctctc ttatgtttaa tggctctagt ttggtgtggc tgttgttata
3541 acaaaccaag tagttgcgca agtagatggt tcagctcttt ttgctggtcc ccaatttaag
3601 ccgattggtg ggaatatcat ggctcatgcc accacaacaa ggtctgtaat gtttgcaaat
3661 tcgacaaatc catgtttctt agtgtttttt gttagggttc gttaggaaag tgtgtttgtt
3721 atgataattg atacaggttg gcgttgagga aaggaagagc agaggagaga atctgtaaag
3781 tgataagctc gccatgtttg ccagaagcgg aagctcgatt tcaaatatct acagaaggtg
3841 taacagattg caaggattga tgtttttggc cacaacaatg ccttgcttct ctctgcctta
3901 tgtgcttttt tttagcaacc ccttcacttg tattgatgtc aaaaaaccgt tttcttttgt
3961 ttttgttttt aactgttgat catagtgtaa gcgtacaagt aagcgatgat ggagcaaaag
4021 agctttgtaa gtaaagtttt tcgggtaatg aaccttttgt gtttcacaaa cgcacacagc
4081 ctcgtggttt atacaacact atatattact tagtgaacaa tcaaatatta actaaccatt
4141 acttaccatt agaatgtctt atggtttatc atgtttttta tatactaata ggcatttaat
4201 cacatttctt tttatttaac aacgaaaaat ccatattatt acaattgtaa aacataaaag
4261 catcaccatc tccaatggtt taattagtaa ttgtgattca catattgtga ttctataaac
4321 tcacaaattg tgattcgata gaagtctaat tattataata aaaattgtca taagaacact
4381 aaaatagttt atcatgtttt tgtcatacta tctcagataa ttttcatttt ttctgttgtt
4441 gacatcacca tctccaatat tcaaaacatg atttgtttgt acctttttag tattaacaaa
4501 acttttgtta gaatgactca tcctctgcat atatataaaa aaatcattag tatttttata
4561 tcaagtctac aaattatttt catttacttt attatttttt tttaccaaac acatttagtt
4621 ctatatacta tttattctta ttaactaaag ttactcagaa ttataggtgt aactggaaaa
4681 ctggataaag agagaaagag agagagtaca taacataagt cctcgaaagt gcagctattc
4741 cattagcggc cagacagctg ttagtatata ttgcttcgtc acgggctctt ttttttttct
4801 ttttaattta aagatccctt aaaaaaatct tttttgtttt aaagaagaga gaaattgtgg
4861 ttttgttg

//
1: D14134. Human mRNA for RA...[gi:285976]

LOCUS HUMRAD51 2229 bp mRNA linear PRI 03-FEB-1999
DEFINITION Human mRNA for RAD51, complete cds.
ACCESSION D14134
VERSION D14134.1 GI:285976
KEYWORDS RAD51; histone H2A.
SOURCE Homo sapiens (sub_species:Caucasian) testis cDNA to mRNA,
clone_lib:cDNA in pCD8.
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 2229)
AUTHORS Morita,T.
TITLE Direct Submission
JOURNAL Submitted (22-JAN-1993) Takashi Morita, Research Institute for
Microbial Deseases, Dept. of Microbial Genetics, Osaka Univ.; 3-1
Yamadaoka, Suita, Osaka 565, Japan (Tel:06-877-5121(ex.3172),
Fax:06-876-2678)
REFERENCE 2 (bases 1 to 2229)
AUTHORS Yoshimura,Y., Morita,T., Yamamoto,A. and Matsushiro,A.
TITLE Cloning and sequence of the human RecA-like gene cDNA
JOURNAL Nucleic Acids Res. 21 (7), 1665 (1993)
MEDLINE 93241950
COMMENT Submitted (22-JAN-1993) to DDBJ by:
Takashi Morita
Department of Microbial Genetics
Research Institute for Microbial Diseases
Osaka University
3-1 Yamadaoka
Suita, Osaka 565
Japan
Phone: 06-875-2913
Fax: 06-876-2678.
FEATURES Location/Qualifiers
source 1..2229
/organism="Homo sapiens"
/sub_species="Caucasian"
/db_xref="taxon:9606"
/tissue_type="testis"
/clone_lib="cDNA in pCD8"
gene 233..1252
/gene="Rad51"
CDS 233..1252
/gene="Rad51"
/codon_start=1
/product="RAD51"
/protein_id="BAA03189.1"
/db_xref="GI:285977"
/translation="MAMQMQLEANADTSVEEESFGPQPISRLEQCGINANDVKKLEEA
GFHTVEAVAYAPKKELINIKGISEAKADKILAEAAKLVPMGFTTATEFHQRRSEIIQI
TTGSKELDKLLQGGIETGSITEMFGEFRTGKTQICHTLAVTCQLPIDRGGGEGKAMYI
DTEGTFRPERLLAVAERYGLSGSDVLDNVAYARAFNTDHQTQLLYQASAMMVESRYAL
LIVDSATALYRTDYSGRGELSARQMHLARFLRMLLRLADEFGVAVVITNQVVAQVDGA
AMFAADPKKPIGGNIIAHASTTRLYLRKGRGETRICKIYDSPCLPEAEAMFAINADGV
GDAKD"
repeat_region 1862..2173
/rpt_unit=1862..1872
polyA_signal 2208..2213
BASE COUNT 593 a 472 c 602 g 562 t 

ORIGIN Chromosome 15. 

1 ccgcgcgcag cggccagaga ccgagcccta aggagagtgc ggcgcttccc gaggcgtgca 
61 gctgggaact gcaactcatc tgggttgtgc gcagaaggct ggggcaagcg agtagagaag 
121 tggagcgtaa gccaggggcg ttgggggccg tgcgggtcgg gcgcgtgcca cgcccgcggg 
181 gtgaagtcgg agcgcggggc ctgctggaga gaggagcgct gcggaccgag taatggcaat 
241 gcagatgcag cttgaagcaa atgcagatac ttcagtggaa gaagaaagct ttggcccaca 
301 acccatttca cggttagagc agtgtggcat aaatgccaac gatgtgaaga aattggaaga 
361 agctggattc catactgtgg aggctgttgc ctatgcgcca aagaaggagc taataaatat 
421 taagggaatt agtgaagcca aagctgataa aattctggct gaggcagcta aattagttcc 
481 aatgggtttc accactgcaa ctgaattcca ccaaaggcgg tcagagatca tacagattac
541 tactggctcc aaagagcttg acaaactact tcaaggtgga attgagactg gatctatcac 
601 agaaatgttt ggagaattcc gaactgggaa gacccagatc tgtcatacgc tagctgtcac 
661 ctgccagctt cccattgacc ggggtggagg tgaaggaaag gccatgtaca ttgacactga 
721 gggtaccttt aggccagaac ggctgctggc agtggctgag aggtatggtc tctctggcag 
781 tgatgtcctg gataatgtag catatgctcg agcgttcaac acagaccacc agacccagct 
841 cctttatcaa gcatcagcca tgatggtaga atctaggtat gcactgctta ttgtagacag 
901 tgccaccgcc ctttacagaa cagactactc gggtcgaggt gagctttcag ccaggcagat 
961 gcacttggcc aggtttctgc ggatgcttct gcgactcgct gatgagtttg gtgtagcagt 
1021 ggtaatcact aatcaggtgg tagctcaagt ggatggagca gcgatgtttg ctgctgatcc 
1081 caaaaaacct attggaggaa atatcatcgc ccatgcatca acaaccagat tgtatctgag 
1141 gaaaggaaga ggggaaacca gaatctgcaa aatctacgac tctccctgtc ttcctgaagc 
1201 tgaagctatg ttcgccatta atgcagatgg agtgggagat gccaaagact gaatcattgg 
1261 gtttttcctc tgttaaaaac cttaagtgct gcagcctaat gagagtgcac tgctccctgg 
1321 ggttctctac aggcctcttc ctgttgtgac tgccaggata aagcttccgg gaaaacagct 
1381 attatatcag cttttctgat ggtataaaca ggagacaggt cagtagtcac aaactgatct 
1441 aaaatgttta ttccttctgt agtgtattaa tctctgtgtg ttttctttgg ttttggagga 
1501 ggggtatgaa gtatctttga catggtgcct taggaatgac ttgggtttaa caagctgtct 
1561 actggacaat cttatgtttc caagagaact aaagctggag agacctgacc cttctctcac 
1621 ttctaaatta atggtaaaat aaaatgcctc agctatgtag caaagggaat gggtctgcac 
1681 agattctttt tttctgtcag taaaactctc aagcaggttt ttaagttgtc tgtctgaatg 
1741 atcttgtgta agggtttggt tatggagtct tgtgccaaac ctactaggcc attagccctt 
1801 caccatctac ctgcttggtc tttcattgct aagactaact caagataatc ctagagtctt 
1861 aaagcatttc aggccagtgt ggtgtcttgc gcctgtactc ccagcacttt gggaggccga 
1921 ggcaggtgga tcgcttgagc caggagtttt aagtccagct tggccaagat ggtgaaatcc 
1981 catctctaca aaaaatgcag aacttaatct ggacacactg ttacacgtgc ctgtagtccc 
2041 agctactcta tagcctgagg tgggagaatc acttaagcct ggaaggtgga agttgcagtg 
2101 agtcgagatt gcactgctgc attccagcca gggtgacaga gtgagaccat gtttcaaaca 
2161 agaaacattt cagagggcaa gtaaacagat ttgattgtga ggcttctaat aaagtagtta 
2221 ttagtagtg 

// 
1: NM_079844. Drosophila melano...[gi:17864107]

LOCUS NM_079844 1249 bp mRNA linear INV 15-DEC-2001
DEFINITION Drosophila melanogaster Rad51-like (Rad51), mRNA.
ACCESSION NM_079844
VERSION NM_079844.1 GI:17864107
KEYWORDS .
SOURCE fruit fly.
ORGANISM Drosophila melanogaster
Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
Pterygota; Neoptera; Endopterygota; Diptera; Brachycera;
Muscomorpha; Ephydroidea; Drosophilidae; Drosophila.
REFERENCE 1 (sites)
AUTHORS Akaboshi,E., Inoue,Y. and Ryo,H.
TITLE Cloning of the cDNA and genomic DNA that correspond to the
recA-like gene of Drosophila melanogaster
JOURNAL Jpn. J. Genet. 69 (6), 663-670 (1994)
MEDLINE 95161094
REFERENCE 2 (bases 1 to 1249)
AUTHORS Adams,M.D., Celniker,S.E., Holt,R.A., Evans,C.A., Gocayne,J.D.,
Amanatides,P.G., Scherer,S.E., Li,P.W., Hoskins,R.A., Galle,R.F.,
George,R.A., Lewis,S.E., Richards,S., Ashburner,M., Henderson,S.N.,
Sutton,G.G., Wortman,J.R., Yandell,M.D., Zhang,Q., Chen,L.X.,
Brandon,R.C., Rogers,Y.H., Blazej,R.G., Champe,M., Pfeiffer,B.D.,
Wan,K.H., Doyle,C., Baxter,E.G., Helt,G., Nelson,C.R., Gabor,G.L.,
Abril,J.F., Agbayani,A., An,H.J., Andrews-Pfannkoch,C., Baldwin,D.,
Ballew,R.M., Basu,A., Baxendale,J., Bayraktaroglu,L., Beasley,E.M.,
Beeson,K.Y., Benos,P.V., Berman,B.P., Bhandari,D., Bolshakov,S.,
Borkova,D., Botchan,M.R., Bouck,J., Brokstein,P., Brottier,P.,
Burtis,K.C., Busam,D.A., Butler,H., Cadieu,E., Center,A.,
Chandra,I., Cherry,J.M., Cawley,S., Dahlke,C., Davenport,L.B.,
Davies,P., de Pablos,B., Delcher,A., Deng,Z., Mays,A.D., Dew,I.,
Dietz,S.M., Dodson,K., Doup,L.E., Downes,M., Dugan-Rocha,S.,
Dunkov,B.C., Dunn,P., Durbin,K.J., Evangelista,C.C., Ferraz,C.,
Ferriera,S., Fleischmann,W., Fosler,C., Gabrielian,A.E., Garg,N.S.,
Gelbart,W.M., Glasser,K., Glodek,A., Gong,F., Gorrell,J.H., Gu,Z.,
Guan,P., Harris,M., Harris,N.L., Harvey,D., Heiman,T.J.,
Hernandez,J.R., Houck,J., Hostin,D., Houston,K.A., Howland,T.J.,
Wei,M.H., Ibegwam,C., Jalali,M., Kalush,F., Karpen,G.H., Ke,Z.,
Kennison,J.A., Ketchum,K.A., Kimmel,B.E., Kodira,C.D., Kraft,C.,
Kravitz,S., Kulp,D., Lai,Z., Lasko,P., Lei,Y., Levitsky,A.A.,
Li,J., Li,Z., Liang,Y., Lin,X., Liu,X., Mattei,B., McIntosh,T.C.,
McLeod,M.P., McPherson,D., Merkulov,G., Milshina,N.V., Mobarry,C.,
Morris,J., Moshrefi,A., Mount,S.M., Moy,M., Murphy,B., Murphy,L.,
Muzny,D.M., Nelson,D.L., Nelson,D.R., Nelson,K.A., Nixon,K.,
Nusskern,D.R., Pacleb,J.M., Palazzolo,M., Pittman,G.S., Pan,S.,
Pollard,J., Puri,V., Reese,M.G., Reinert,K., Remington,K.,
Saunders,R.D., Scheeler,F., Shen,H., Shue,B.C., Siden-Kiamos,I.,
Simpson,M., Skupski,M.P., Smith,T., Spier,E., Spradling,A.C.,
Stapleton,M., Strong,R., Sun,E., Svirskas,R., Tector,C., Turner,R.,
Venter,E., Wang,A.H., Wang,X., Wang,Z.Y., Wassarman,D.A.,
Weinstock,G.M., Weissenbach,J., Williams,S.M., Woodage,T.,
Worley,K.C., Wu,D., Yang,S., Yao,Q.A., Ye,J., Yeh,R.F.,
Zaveri,J.S., Zhan,M., Zhang,G., Zhao,Q., Zheng,L., Zheng,X.H.,
Zhong,F.N., Zhong,W., Zhou,X., Zhu,S., Zhu,X., Smith,H.O.,
Gibbs,R.A., Myers,E.W., Rubin,G.M. and Venter,J.C.
TITLE The genome sequence of Drosophila melanogaster
JOURNAL Science 287 (5461), 2185-2195 (2000)
MEDLINE 20196006
PUBMED 10731132
REFERENCE 3 (sites)
AUTHORS Akaboshi,E. and Inoue,Y.
TITLE Cloning of the Drosophila RecA-like cDNA
JOURNAL Unpublished (1994)
REFERENCE 4 (bases 1 to 1249)
AUTHORS Akaboshi,E.
JOURNAL Unpublished (1995)
COMMENT PROVISIONAL REFSEQ: This record has not yet been subject to final
NCBI review. The reference sequence was derived from D17726.1.
FEATURES Location/Qualifiers
source 1..1249
/organism="Drosophila melanogaster"
/db_xref="taxon:7227"
/chromosome="3"
/map="99D5-99D7"
/cell_line="SICII (GM1)"
gene 1..1249
/gene="Rad51"
/note="CG7948; RAD51; DmRad51; Rad51dm; DMR; DroRAD51;
DMR1; RA51; CT6389"
/db_xref="FLYBASE:FBgn0011700;"
/db_xref="LocusID:43577"
CDS 63..1073
/gene="Rad51"
/EC_number="3.6.1.3"
/codon_start=1
/db_xref="FLYBASE:FBgn0011700;"
/db_xref="LocusID:43577"
/product="Rad51-like"
/protein_id="NP_524583.1"
/db_xref="GI:17864108"
/translation="MEKLTNVQAQQEEEEEEGPLSVTKLIGGSITAKDIKLLQQASLH
TVESVANATKKQLMAIPGLGGGKVEQIITEANKLVPLGFLSARTFYQMRADVVQLSTG
SKELDKLLGGGIETGSITEIFGEFRCGKTQLCHTLAVTCQLPISQKGGEGKCMYIDTE
NTFRPERLAAIAQRYKLNESEVLDNVAFTRAHNSDQQTKLIQMAAGMLFESRYALLIV
DSAMALYRSDYIGRGELAARQNHLGLFLRMLQRLADEFGVAVVITNQVTASLDGAPGM
FDAKKPIGGHIMAHSSTTRLYLRKGKGETRICKIYDSPCLPESEAMFAILPDGIGDAR
ES"
misc_feature 432..455
/note="nucleotide binding consensus sequence (A box)"
misc_feature 705..719
/note="nucleotide binding consensus sequence (B box)"
BASE COUNT 322 a 315 c 338 g 274 t
ORIGIN



1 gtacaattta aataacgaaa ctcgaaacaa caaatagcca aatttaactt cgaaactgtc 
61 aaatggagaa gctaacgaat gttcaggcac agcaggaaga ggaggaggaa gaaggtccac 
121 tcagcgtgac taagttaata ggcggcagca tcacggccaa ggacatcaag ctgctccagc 
181 aggccagtct gcacaccgtg gagtcggtgg ccaatgccac caagaagcaa ctgatggcca 
241 ttcccggctt gggcggcggc aaggtggagc agatcatcac ggaggccaac aaactggtgc 
301 ctctgggctt ccttagtgcc cgcaccttct atcaaatgcg tgccgatgtt gtgcagctga 
361 gcacgggctc caaggagctg gacaaactgc tgggcggcgg cattgagacg ggatccatta 
421 ccgagatctt cggcgagttc cgctgcggaa agacgcaatt gtgccacact ctggcagtaa 
481 cctgccagct gcccatcagc cagaagggcg gcgagggcaa gtgcatgtac atcgacacgg 
541 agaacacctt ccgtccggag cgtttggcag ccatcgcgca aaggtacaaa ctgaatgaat 
601 ccgaggtgct ggacaatgtg gccttcaccc gtgcccacaa ctcagatcag cagaccaagc 
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661 tcatccagat ggcggcg^c atgctctttg agtccagata cgctttgWg attgtggaca 



// 



721 gtgccatggc gctctacaga tccgattata ttggtcgcgg ggagctggcc gccaggcaaa 
781 accatttggg cttatttctg cgcatgcttc aacgcctggc cgatgagttc ggagtggctg 
841 tggtaattac taaccaggtt actgcctcgc tggacggcgc acccggcatg tttgatgcca 



901 agaagcccat tggcgggcac atcatggccc actcctcaac cacgcggctg tatctgcgca 

961 agggtaaagg cgaaacccgc atctgcaaga tctacgactc gccctgtttg ccggaatcgg 

1021 aggccatgtt cgccattctg ccggatggaa taggagacgc cagggagagc taattgtgct 

1081 cacttatagg ttaattaatg ctaggaatag agctccctaa ctttctaatg attacttatc 

1141 cctagactaa gaaataaact tcgtagattg tttttttttt ttgttttaat gtttacttag 

1201 aatgtttagg ataacgacga aataaaacag ccgttgcgaa ttgtatgta 
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(gi|6598365:48046-48220, 48625-48744, 48827- 
48860, 48943-49013, 49092-49322, 49436-49560, 
49646-49712, 49809-49984) 



LOCUS 

DEFINITION 

ACCESSION 
VERSION 
KEYWORDS 
SOURCE 

ORGANISM 



REFERENCE 
AUTHORS 



TITLE 

JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 

COMMENT 



AC002387 999 bp DNA PLN 05-APR-200 

CDS from: Arabidopsis thaliana chromosome II section 242 of 

the complete sequence. Sequence from clones T14P1, F4L23. 

AC002387 

AC002387.2 

HTG. 

thale cress. 
Arabidopsis thaliana 

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheo 
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; 
Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis 

1 (bases 48046 to 49984) 

Lin,X., Kaul,S., Rounsley,S.D., Shea,T.P., Benito,M.-I., Tow 
Fujii,C.Y., Mason,T.M., Bowman,C.L., Barnstead,M.E., 
Feldblyum,T.V., Buell,C.R., Ketchum,K.A., Lee,J.J., Ronning, 
Koo,H., Moffat,K.S., Cronin,L.A., Shen,M., VanAken,S.E., Uma 
Tallon,L.J., Gill,J.E., Adams,M.D., Carrera,A.J., Creasy,T.H 
Goodman,H.M., Somerville,C.R., Copenhaver,G.P., Preuss,D., 
Nierman,W.C., White,O., Eisen,J.A., Salzberg,S.L., Fraser,C. 
Venter,J.C. 

Sequence and analysis of chromosome 2 of the plant Arabidops 
thaliana 

Nature 402 (6763), 761-768 (1999) 

20083487 

10617197 

2 (bases 48046 to 49984) 
Lin,X. 

Direct Submission 

Submitted (09-MAR-2000) The Institute for Genomic Research, 
Medical Center Dr., Rockville, MD 20850, USA 
On Dec 17, 1999 this sequence version replaced gi:2583106. 
The sequence and annotation of chromosome 2 were merged from 
of the individual clones on this chromosome after removing 
overlaps. For detailed information, please see the TIGR web 
(http://www.tigr.org/tdb/at/at.html). 

Genes were identified by a combination of three methods: Gen 
prediction programs including GRAIL 

(ftp://arthur.epm.ornl.gov/pub/xgrail), Genefinder (Phil Gre 
University of Washington), Genscan (Chris Burge, 
http://gnomic.stanford.edu/GENSCANW.html) , and NetPlantGene 
(http://www.cbs.dtu.dk/services/NetGene2/), searches of the 
complete sequence against a peptide database and plant EST 
databases at TIGR, and manual curations based on those analy 
Annotated genes are named to indicate the level of evidence 
their annotation. Genes with similarity to other proteins ar 
after the database hits. Genes without significant peptide 
similarity but with EST similarity are named as 'unknown' pr 
Genes without protein or EST similarity, that are predicted 
or more gene prediction programs over most of their length a 
annotated as 'hypothetical' proteins. Genes encoding tRNAs a 
predicted by tRNAscan-SE (Sean Eddy, 

http://genome.wustl.edu/eddy/tRNAscan-SE/ 
identified by repeatmasker (Arian Smit, 
http://ftp.genome.washington.edu/RM/RepeatMasker.html) 
numbered from the top to bottom of the chromosome. 






Simple repeats w 



Gene 



FEATURES 

misc_ 
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gene 

CDS 



BASE COUNT 
ORIGIN 

1 



We thank the CSHL/WashU/ABI consortium for sequencing BAC cl 
F6P23, F5J6, T17A5, and T13L16, the ESSA group for sequencin 
F13D4, and Scott Jackson, Jiming Jiang, Klaus Meyer, Eric Ri 
and Satoshi Tabata for helpful assistance. In addition, we w 
like to thank the TIGR Bioinformatics Department, especially 
Zhou, Hanif Khalak, Michael E. Heaney, Lily Fu, Feng Liang, 
Peterson, Michael Holmes, and Delwood Richardson for softwar 
database support. 

This work was supported by the National Science Foundation, 
Department of Energy and the US Department of Agriculture. 

Address all correspondence to: at@tigr.org. 
Location/Qualifiers 
feature complement(<1..>999) 

/note="Sequence from clone F4L23" 
<1..>999 

/gene="At2g45280" 
<1..>999 

/gene="At2g45280" 
/note="F4L23.21" 
1..999 

/gene="At2g45280" 
/codon_start=1 

/product="putative RAD51C-like DNA repair protein" 
/protein_id="AAB82635.1" 
/db_xref="GI:2583126" 

/translation="MISFGRRKSPAIEETSLATSVMEAWRLPLSPSIRGKL 
LSSIASVSSSDLARAKNAWDMLHEEESLPRITTSCSDLDNILGGGISCRDV 
GIGKTQIGIQLSVNVQIPRECGGLGGKAIYIDTEGSFMVERALQIAEACVE 
YMHKHFQANQVQMKPEDILENIFYFRVCSYTEQIALVNHLEKFISENKDVV 
FHFRQDYDDLAQRTRVLSEMALKFMKLAKKFSLAVVLLNQVTTKFSEGSFQ 
SWSHSCTNRVILYWNGDERYAYIDKSPSLPSASASYTVTSRGLRNSSSSSK 
272 a 203 c 243 g 281 t 



// 



atgatttcat ttgggcggcg taaatcgccg gcgattgaag aaacttcact cgcgact 

gtcatggagg catggaggtt accgttatcg ccttcgatta gaggaaaact gatatcg 

ggttatactt gtctgtcttc gattgcttcc gtctcttctt ctgatctcgc tcgagca 

aacgcttggg atatgcttca cgaggaggag tctttgccgc gtattactac atcttgc 

gatcttgata acattttggg cggtggaatt agctgtaggg atgttacaga gattggt 

gtaccaggga ttggcaagac tcagattggg atccagctct ctgtgaatgt tcagatt 

cgtgagtgtg gtggtcttgg agggaaagct atatatatcg atacagaagg tagcttc 

gtggagcgtg ctttacagat agcagaagct tgtgtagagg acatggaaga atacaca 

tacatgcata aacattttca agcaaatcaa gtacaaatga aaccagaaga tatctta 

aacatattct acttccgtgt ctgcagttac accgagcaaa tcgcattggt caatcat 

601 gaaaagttca tctctgaaaa caaagatgta gttgtaatcg tagacagtat caccttt 

661 ttccgtcagg actatgatga cttagcccag aggacacgag tgctcagcga aatggct 

aagttcatga agcttgccaa aaagttctca cttgcggtcg tgttactaaa ccaggtg 

acaaagttta gtgaaggctc gtttcaacta gcgcttgctt taggcgatag ctggtct 

tcgtgcacca accgagtcat tctgtattgg aatggtgatg agcgttacgc atatatc 

aagtcccctt cacttccttc agcttcggct tcatacactg taaccagtag aggtcta 

aactcatcct cgagtagcaa gcgagtcaag atgatgtaa 



61 
121 
181 
241 
301 
361 
421 
481 
541 



721 
781 
841 
901 
961 


LOCUS       AF029669     1295 bp    mRNA    linear   PRI 24-FEB-1998
DEFINITION  Homo sapiens Rad51C (RAD51C) mRNA, complete cds.
ACCESSION   AF029669
VERSION     AF029669.1  GI:2909800
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1295)
  AUTHORS   Dosanjh,M.K., Collins,D.W., Fan,W., Lennon,G.G., Albala,J.S.,
            Shen,Z. and Schild,D.
  TITLE     Isolation and characterization of RAD51C, a new human member of the
            RAD51 family of related genes
  JOURNAL   Nucleic Acids Res. 26 (5), 1179-1184 (1998)
  MEDLINE   98136197
REFERENCE   2  (bases 1 to 1295)
  AUTHORS   Schild,D., Collins,D.W. and Dosanjh,M.K.
  TITLE     Direct Submission
  JOURNAL   Submitted (09-OCT-1997) Life Sciences, Lawrence Berkeley National
            Laboratory, Ms. 70A-1118, Berkeley, CA 94720, USA
FEATURES             Location/Qualifiers
     source          1..1295
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="17"
                     /map="17q; 413.6cR from the top"
                     /cell_type="leukocyte (mixed population)"
     gene            1..1295
                     /gene="RAD51C"
     CDS             43..1173
                     /gene="RAD51C"
                     /note="member of the Rad51 family of related proteins"
                     /codon_start=1
                     /product="Rad51C"
                     /protein_id="AAC39604.1"
                     /db_xref="GI:2909801"
                     /translation="MRGKTFRFEMQRDLVSFPLSPAVRVKLVSAGFQTAEELLEVKPS
                     ELSKEVGISKAEALETLQIIRRECLTNKPRYAGTSESHKKCTALELLEQEHTQGFIIT
                     FCSALDDILGGGVPLMKTTEICGAPGVGKTQLCMQLAVDVQIPECFGGVAGEAVFIDT
                     EGSFMVDRVVDLATACIQHLQLIAEKHKGEEHRKALEDFTLDNILSHIYYFRCRDYTE
                     LLAQVYLLPDFLSEHSKVRLVIVDGIAFPFRHDLDDLSLRTRLLNGLAQQMISLANNH
                     RLAVILTNQMTTKIDRNQALLVPALGESWGHAATIRLIFHWDRKQRLATLYKSPSQKE
                     CTVLFQIKPQGFRDTVVTSACSLQTEGSLSTRKRSRDPEEEL"
BASE COUNT      392 a    249 c    308 g    346 t
ORIGIN

1 gtgcggagtt tggctgctcc ggggttagca ggtgagcctg cgatgcgcgg gaagacgttc 
61 cgctttgaaa tgcagcggga tttggtgagt ttcccgctgt ctccagcggt gcgggtgaag 
121 ctggtgtctg cggggttcca gactgctgag gaactcctag aggtgaaacc ctccgagctt 
181 agcaaagaag ttgggatatc taaagcagaa gccttagaaa ctctgcaaat tatcagaaga 
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241 gaatgtctca caaataa^R: aagatatgct ggtacatctg agtcaca^Ba gaagtgtaca 
301 gcactggaac ttcttgagca ggagcatacc cagggcttca taatcacctt ctgttcagca 
361 ctagatgata ttcttggggg tggagtgccc ttaatgaaaa caacagaaat ttgtggtgca 
421 ccaggtgttg gaaaaacaca attatgtatg cagttggcag tagatgtgca gataccagaa 
481 tgttttggag gagtggcagg tgaagcagtt tttattgata cagagggaag ttttatggtt 
541 gatagagtgg tagaccttgc tactgcctgc attcagcacc ttcagcttat agcagaaaaa 
601 cacaagggag aggaacaccg aaaagctttg gaggatttca ctcttgataa tattctttct 
661 catatttatt attttcgctg tcgtgactac acagagttac tggcacaagt ttatcttctt 
721 ccagatttcc tttcagaaca ctcaaaggtt cgactagtga tagtggatgg tattgctttt 
781 ccatttcgtc atgacctaga tgacctgtct cttcgtactc ggttattaaa tggcctagcc 
841 cagcaaatga tcagccttgc aaataatcac agattagctg taattttaac caatcagatg 
901 acaacaaaga ttgatagaaa tcaggccttg cttgttcctg cattagggga aagttgggga 
961 catgctgcta caatacggct aatctttcat tgggaccgaa agcaaaggtt ggcaacattg 
1021 tacaagtcac ccagccagaa ggaatgcaca gtactgtttc aaatcaaacc tcagggattt 
1081 agagatactg ttgttacttc tgcatgttca ttgcaaacag aaggttcctt gagcacccgg 
1141 aaacggtcac gagacccaga ggaagaatta taacccagaa acaaatctca aagtgtacaa 
1201 atttattgat gttgtgaaat caatgtgtac aagtggactt gttaccttaa agtataaata 
1261 aacacactat ggcatgaatg aaaaaaaaaa aaaaa 

//

LOCUS       HSU84138     1764 bp    mRNA    linear   PRI 05-MAY-1998
DEFINITION  Homo sapiens DNA repair protein RAD51B mRNA, complete cds.
ACCESSION   U84138
VERSION     U84138.1  GI:2801404
KEYWORDS
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1764)
  AUTHORS   Albala,J.S., Thelen,M.P., Prange,C., Fan,W., Christensen,M.,
            Thompson,L.H. and Lennon,G.G.
  TITLE     Identification of a novel human RAD51 homolog, RAD51B
  JOURNAL   Genomics 46 (3), 476-479 (1997)
  MEDLINE   98110585
REFERENCE   2  (bases 1 to 1764)
  AUTHORS   Albala,J.S., Prange,C.K., Fan,W., Christensen,M., Thelen,M. and
            Lennon,G.G.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-JAN-1997) Biology and Biotechnology Research Program,
            Lawrence Livermore National Laboratory, 7000 East Avenue,
            Livermore, CA 94550, USA
FEATURES             Location/Qualifiers
     source          1..1764
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="14"
                     /map="14q23-q24.2"
     5'UTR           1..54
     CDS             55..1107
                     /codon_start=1
                     /product="DNA repair protein RAD51B"
                     /protein_id="AAC39723.1"
                     /db_xref="GI:2801405"
                     /translation="MGSKKLKRVGLSQELCDRLSRHQILTCQDFLCLSPLELMKVTGL
                     SYRGVHELLCMVSRACAPKMQTAYGIKAQRSADFSPAFLSTTLSALDEALHGGVACGS
                     LTEITGPPGCGKTQFCIMMSILATLPTNMGGLEGAVVYIDTESAFSAERLVEIAESRF
                     PRYFNTEEKLLLTSSKVHLYRELTCDEVLQRIESLEEEIISKGIKLVILDSVASVVRK
                     EFDAQLQGNLKERNKFLAREASSLKYLAEEFSIPVILTNQITTHLSGALASQADLVSP
                     ADDLSLSEGTSGSSCVIAALGNTWSHSVNTRLILQYLDSERRQILIAKSPLAPFTSFV
                     YTIKEEGLVLQAYGNS"
     3'UTR           1108..1764
     polyA_signal    1727..1732
BASE COUNT      513 a    361 c    387 g    503 t
ORIGIN

1 gggaaactgt gtaaagggtg gggaaacttg aaagttggat gctgcagacc cggcatgggt 
61 agcaagaaac taaaacgagt gggtttatca caagagctgt gtgaccgtct gagtagacat 
121 cagatcctta cctgtcagga ctttttatgt ctttccccac tggagcttat gaaggtgact 
181 ggtctgagtt atcgaggtgt ccatgaactt ctatgtatgg tcagcagggc ctgtgcccca 
241 aagatgcaaa cggcttatgg gataaaagca caaaggtctg ctgatttctc accagcattc 
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301 ttatctacta ccctttc^^: tttggacgaa gccctgcatg gtggtg^^c ttgtggatcc 
361 ctcacagaga ttacaggtcc accaggttgt ggaaaaactc agttttgtat aatgatgagc 
421 attttggcta cattacccac caacatggga ggattagaag gagctgtggt gtacattgac 
481 acagagtctg catttagtgc tgaaagactg gttgaaatag cagaatcccg ttttcccaga 
541 tattttaaca ctgaagaaaa gttacttttg acaagtagta aagttcatct ttatcgggaa 
601 ctcacctgtg atgaagttct acaaaggatt gaatctttgg aagaagaaat tatctcaaaa 
661 ggaattaaac ttgtgattct tgactctgtt gcttctgtgg tcagaaagga gtttgatgca 
721 caacttcaag gcaatctcaa agaaagaaac aagttcttgg caagagaggc atcctccttg 
781 aagtatttgg ctgaggagtt ttcaatccca gttatcttga cgaatcagat tacaacccat 
841 ctgagtggag ccctggcttc tcaggcagac ctggtgtctc cagctgatga tttgtccctg 
901 tctgaaggca cttctggatc cagctgtgtg atagccgcac taggaaatac ctggagtcac 
961 agtgtgaata cccggctgat cctccagtac cttgattcag agagaagaca gattcttatt 
1021 gccaagtccc ctctggctcc cttcacctca tttgtctaca ccatcaagga ggaaggcctg 
1081 gttcttcaag cctatggaaa ttcctagaga cagataaatg tgcaaacctg ttcatcttgc 
1141 caagaaaaat ccgctttttt gccacagaaa caaaatattg ggaaagagtc ttgtggtgaa 
1201 acacccatcg ttctttgcta aaacatttgg ttgctactgt gtagactcag cttaagtcat 
1261 ggaattctag aggatgtatc tcacaagtag gatcaagaac aagcccaaca gtaatctgca 
1321 tcataagctg atttgatacc atggcactga caatgggcac tgatttgata ccatggcact 
1381 gacatgggca cacagggaac aggaaatggg aatgagagca agggttgggt tgtgttcgtg 
1441 gaacacatag gttttttttt tttaactttc tctttctaaa atatttcatt ttgatggagg 
1501 tgaaatttat ataagatgaa attaaccatt ttaaagtaaa caattccgtg gcaactagat 
1561 atcatgatgt gcaaccagca tctctgtcta gttccaaata ttttcatcac cccaaaagca 
1621 agacccataa ccattatgca agtgttccta tttccccctc ctcccagctc ctggaaaccc 
1681 accaatctac tttgttgcta tggctttacc tattctggat atttcatata aatggaatca 
1741 tatagtgtca taaaaaaaaa aaaa 

// 

corn Rad51C SEQ ID NO: 2 encoded by SEQ ID NO: 1 (elected) 

corn Rad51C SEQ ID NO: 6 encoded by SEQ ID NO: 5 

corn Rad51C SEQ ID NO: 4 encoded by SEQ ID NO: 3 

A. thaliana Rad51C protein encoded by GB AC002387 (F4L23.21) 

Human Rad51C protein encoded by GB AF029669 

Archaea RadA/Rad51C protein encoded by partial cds GB AF090211 

Moderate stripping solution 
200 mM Tris-Cl, pH 7.0 
0.1x SSC (appendix 2) 
0.1% (w/v) SDS 

Nucleotide mix 
2.5 mM ATP 
2.5 mM CTP 
2.5 mM GTP 
20 mM Tris-Cl, pH 7.5 
Store at -20°C 



COMMENTARY 

Background Information 

Hybridization between complementary 
polynucleotides was implicit in the Watson- 
Crick model for DNA structure and was ini- 
tially exploited, via renaturation kinetics, as a 
means of studying genome complexity. In these 
early applications, the two hybridizing mole- 
cules were both in solution — an approach that 
is still utilized in "modern" techniques such as 
nuclease protection transcript mapping (units 
4.6 & 4.7) and oligonucleotide-directed mutagenesis 
(Chapter 8). The innovative idea of immobilizing 
one hybridizing molecule on a solid 
support was first proposed by Denhardt (1966) 
and led to methods for identification of specific 
sequences in genomic DNA (dot blotting; unit 
2.9B) and recombinant clones (units 6.3 & 6.4). A 
second dimension was subsequently intro- 
duced by Southern (1975), who showed how 
DNA molecules contained in an electrophore- 
sis gel could be transferred to a membrane (unit 
2.9A), enabling genetic information relating to 
individual restriction fragments to be obtained 
by hybridization analysis. 

Since the pioneering work of Denhardt and 
Southern, advances in membrane hybridiza- 
tion have been technical rather than conceptual. 
As reviewed by Dyson (1991), the detailed 
protocols have become more sophisticated, 
largely because of advances in understanding 
of the factors that influence hybrid stability and 
hybridization rate. 

Hybrid stability is expressed as the melting 
temperature or T_m, which is the temperature at 
which the probe dissociates from the target 
DNA. For DNA-DNA hybrids, the T_m can be 
approximated from the equation of Meinkoth 
and Wahl (1984): 

$$T_m = 81.5^{\circ}\mathrm{C} + 16.6\,(\log M) + 0.41\,(\%\mathrm{GC}) - 0.61\,(\%\mathrm{form}) - 500/L$$

and for RNA-DNA hybrids from the equation 
of Casey and Davidson (1977): 

$$T_m = 79.8^{\circ}\mathrm{C} + 18.5\,(\log M) + 0.58\,(\%\mathrm{GC}) - 11.8\,(\%\mathrm{GC})^2 - 0.56\,(\%\mathrm{form}) - 820/L$$

where M is the molarity of monovalent cations, 
%GC is the percentage of guanosine and cytosine 
nucleotides in the DNA, %form is the 
percentage of formamide in the hybridization 
solution, and L is the length of the hybrid in 
base pairs. The practical considerations that 
arise from these two equations are summarized 
in Table 2.10.2A. 
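
As an illustration only, the DNA-DNA approximation attributed above to Meinkoth and Wahl (1984) can be evaluated with a few lines of Python. The function name, argument names, and example values below are assumptions made for this sketch and do not come from the protocol; the logarithm is taken as base 10.

```python
import math

def tm_dna_dna(monovalent_molar, percent_gc, percent_formamide=0.0, length_bp=500):
    """Approximate T_m (deg C) of a DNA-DNA hybrid.

    Implements T_m = 81.5 + 16.6*log10(M) + 0.41*(%GC) - 0.61*(%form) - 500/L.
    All argument names are hypothetical labels for the symbols in the text.
    """
    if monovalent_molar <= 0 or length_bp <= 0:
        raise ValueError("molarity and hybrid length must be positive")
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)

# Illustrative values: 1 M monovalent cations, 50% GC, 500-bp hybrid.
aqueous = tm_dna_dna(1.0, 50.0)
with_formamide = tm_dna_dna(1.0, 50.0, percent_formamide=50.0)
print(f"aqueous: {aqueous:.1f} C, 50% formamide: {with_formamide:.1f} C")
```

With these inputs the formamide term lowers the estimate by 0.61 x 50 = 30.5°C, which is the effect the formamide hybridization buffers rely on.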

The second important consideration in hy- 
bridization analysis is the rate at which the 
hybrid is formed. Hybrid formation cannot 
occur until complementary regions of the two 
molecules become aligned, which occurs 
purely by chance; however, once a short nucle- 
ating region of the duplex has formed, the 
remaining sequences base-pair relatively rap- 
idly. The rate at which the probe "finds" the 
target, which is influenced by a number of 
factors (Table 2.10.2B), is therefore the limit- 
ing step in hybrid formation (Britten and Da- 
vidson, 1985). However, in practical terms hy- 
bridization rate is less important than hybrid 
stability, as in most protocols hybridization is 
allowed to proceed for so long that factors 
influencing rate become immaterial. 

Critical Parameters 

To be successful, a hybridization experi- 
ment must meet two criteria: 

(1) Sensitivity. Sufficient probe DNA must 
anneal to the target to produce a detectable 
signal after autoradiography. 

(2) Specificity. After the last wash, the probe 
must be attached only to the desired target 
sequence (or, with heterologous probing, fam- 
ily of sequences). 

Parameters influencing these two criteria 
are considered in turn, followed by other mis- 
cellaneous factors that affect hybridization. 

Factors influencing sensitivity 

The sensitivity of hybridization analysis is 
determined by how many labeled probe mole- 
cules attach to the target DNA. The greater the 
number of labeled probe molecules that anneal, 
the greater the intensity of the hybridization 
signal seen after autoradiography. 

Probe specific activity. Of the various factors 
that influence sensitivity, the one that most 
frequently causes problems is the specific 
activity of the probe. Modern labeling procedures, 
whether nick translation, random oligonucleotide 
priming (unit 5.5), or in vitro RNA 
synthesis (alternate protocol), routinely provide 
probes with a specific activity of >10^8 
dpm/µg. This is the minimum specific activity 
that should be used in hybridization analysis of 
genomic DNA, even if the target sequences are 
multicopy. If the specific activity is <10^8 
dpm/µg, hybridization signals will be weak or 
possibly undetectable, and no amount of adjusting 
the hybridization conditions will compensate. 
If there is a problem in obtaining a 
specific activity of >10^8 dpm/µg, it is important 
to troubleshoot the labeling protocol before 
attempting to use the probe in hybridization 
analysis. 

If the probe is labeled to 10^8 to 10^9 dpm/µg, 
it will be able to detect as little as 0.5 pg of 
target DNA. Exactly what this means depends 
on the size of the genome being studied and the 
copy number of the target sequence. For human 
genomic DNA, 0.5 pg of a single-copy sequence 
500 bp in length corresponds to 3.3 µg 
of total DNA. This is therefore the minimum 
amount of human DNA that should be used in 
a dot blot or Southern transfer if a single-copy 
gene is being sought. 
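
The 3.3 µg figure can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the human genome size of 3.3 x 10^9 bp is an assumption that the text above does not state.

```python
def min_genomic_dna_ug(target_pg, target_len_bp, genome_size_bp, copies_per_genome=1):
    """Mass of total genomic DNA (in µg) that contains `target_pg` of target sequence."""
    # Fraction of the genome (by length, hence by mass) made up of target sequence.
    target_fraction = copies_per_genome * target_len_bp / genome_size_bp
    total_pg = target_pg / target_fraction
    return total_pg / 1e6  # 1 µg = 1e6 pg

HUMAN_GENOME_BP = 3.3e9  # assumed genome size, not given in the text

# 0.5 pg detection limit, 500-bp single-copy target
print(round(min_genomic_dna_ug(0.5, 500, HUMAN_GENOME_BP), 2))
```

Under the same assumptions, a target present in ten copies per genome would need one tenth as much total DNA.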

Amount of target DNA. There is, however, 
a second argument that dictates that rather more 
than 3.3 µg of DNA should be loaded with each 
dot or Southern blot. During hybridization, 
genuine target sequences (100% homologous 
to the probe) and heterologous target sequences 
(related but not identical to the probe) compete 
with one another, with the homologous reac- 
tions always predominant. Ideally this compe- 
tition should be maintained until the end of the 
incubation period so that maximum discrimi- 
nation is seen between homologous and heter- 
ologous signals. This occurs only if the mem- 
brane-bound DNA is in excess, so that target 
sequences are continually competing for the 
available probe (Anderson and Young, 1985). 
If the probe is in excess then the homologous 
reaction may reach completion (i.e., all genuine 
target sites become filled) before the end of the 
incubation, leaving a period when only 



Table 2.10.2 Factors Influencing Hybrid Stability and Hybridization Rate(a) 

Factor                   Influence 

A. Hybrid stability(b) 
Ionic strength           T_m increases 16.6°C for each 10-fold increase in monovalent cations 
                         between 0.01 and 0.40 M NaCl 
Base composition         AT base pairs are less stable than GC base pairs in aqueous solutions 
                         containing NaCl 
Destabilizing agents     Each 1% of formamide reduces the T_m by about 0.6°C for a DNA-DNA 
                         hybrid. 6 M urea reduces the T_m by about 30°C 
Mismatched base pairs    T_m is reduced by 1°C for each 1% of mismatching 
Duplex length            Negligible effect with probes >500 bp 

B. Hybridization rate(b) 
Temperature              Maximum rate occurs at 20-25°C below T_m for DNA-DNA hybrids, 
                         10-15°C below T_m for DNA-RNA hybrids 
Ionic strength           Optimal hybridization rate at 1.5 M Na+ 
Destabilizing agents     50% formamide has no effect, but higher or lower concentrations 
                         reduce the hybridization rate 
Mismatched base pairs    Each 10% of mismatching reduces the hybridization rate by a 
                         factor of two 
Duplex length            Hybridization rate is directly proportional to duplex length 
Viscosity                Increased viscosity increases the rate of membrane hybridization; 10% 
                         dextran sulfate increases rate by factor of ten 
Probe complexity         Repetitive sequences increase the hybridization rate 
Base composition         Little effect 
pH                       Little effect between pH 5.0 and pH 9.0 

(a) This table is based on Brown (1991) with permission from BIOS Scientific Publishers. 
(b) There have been relatively few studies of the factors influencing membrane hybridization. In several instances 
extrapolations are made from what is known about solution hybridization. This is probably reliable for hybrid 
stability, less so for hybridization rate. 

heterologous hybridization is occurring and 
during which discrimination between the homologous 
and heterologous signals becomes 
reduced. The problem is more significant with 
a double-stranded rather than a single-stranded 
probe, as with a double-stranded probe reannealing 
between the two probe polynucleotides 
gradually reduces the effective probe concentration 
to such an extent that it always becomes 
limiting towards the end of the incubation. 

In practical terms it is difficult to ensure that 
the membrane-bound DNA is in excess. The 
important factor is not just the absolute amount 
of DNA (which is dependent on the efficiency 
of immobilization and how many times the 
membrane has been reprobed) but also the 
proportion of the DNA that is composed of 
sequences (homologous and heterologous) 
able to hybridize to the probe. Rather than 
attempting complex calculations whose results 
may have factor-of-ten errors, it is advisable 
simply to blot as much DNA as possible: 10 µg 
is sufficient with most genomes. Assuming that 
the probe is labeled adequately and used at the 
correct concentration in the hybridization so- 
lution, a clear result will be obtained after 
autoradiography for a few hours with a simple 
genome (e.g., yeast DNA) or a few days with 
a more complex one (e.g., human DNA). 

Labels other than 32P. The discussion so far 
has assumed that the probe is labeled with 32P. 
The lower emission energy of 35S results in 
reduced sensitivity, and this isotope is in general 
unsuitable for hybridization analysis of 
genomic DNA. 35S can be used only if the 
blotted DNA is exceptionally noncomplex 
(e.g., restricted plasmid DNA), or if the DNA 
is highly concentrated (e.g., colony and plaque 
blots; unit 6.3). Note that a membrane hybridized 
to a 35S-labeled probe has to be dried 
before autoradiography, so probe stripping is 
not possible. 

Nonradioactive probes are a more realistic 
option for hybridization analysis of genomic 
DNA and are becoming increasingly popular 
as the problems involved in their use are grad- 
ually ironed out. Their advantages include 
greater safety, the fact that large amounts of 
probe can be prepared in one batch and stored 
for years, and the rapidity of the detection 
protocols. Their main disadvantage is that the 
sensitivity of most nonradioactive detection 
systems is lower than that of 32P autoradiography, 
which means that the blot and hybridization 
have to be carried out at maximum efficiency 
if a satisfactory signal is to be seen. For 
details on hybridization analysis with nonradioactive 
probes, see units 3.18 & 3.19 and Mundy 
et al. (1991). 

Using an inert polymer to increase sensitiv- 
ity. In addition to adjusting the parameters dis- 
cussed above, an improvement in sensitivity 
can also be achieved by adding an inert polymer 
such as 10% (w/v) dextran sulfate (molecular 
weight 500,000) or 8% (w/v) PEG 6000 to the 
hybridization solution. Both induce an increase 
in hybridization signal, 10-fold with a single- 
stranded probe and as much as 100-fold if the 
probe is double-stranded (Wahl et al., 1979; 
Amasino, 1986). The improvement is thought 
to arise from formation of interlocked meshes 
of probe molecules, which anneal en masse at 
target sites. Increased hybridization signals are 
a major bonus in detecting single-copy se- 
quences in complex genomes, but this must be 
balanced with the fact that the polymers make 
the hybridization solutions very viscous and 
difficult to handle. 

Factors influencing specificity 

Ensuring specificity in homologous hybridization 
experiments. The hybridization incubation 
is carried out in a high-salt solution that 
promotes base-pairing between probe and target 
sequences. In 5x SSC, the T_m for genomic 
DNA with a GC content of 50% is about 96°C. 
Hybridization is normally carried out at 68°C, 
so the specificity of the experiment is not determined 
at this stage. Specificity is the function 
of post-hybridization washes, the critical 
parameters being the ionic strength of the final 
wash solution and the temperature at which this 
wash is carried out. 

The highly stringent wash conditions de- 
scribed in the basic and alternate protocols 
should destabilize all mismatched heterodu- 
plexes, so that hybridization signals are ob- 
tained only from sequences that are perfectly 
homologous to the probe. For DNA and RNA 
probes (as opposed to oligonucleotides), prob- 
lems with lack of specificity after the highly 
stringent wash occur only if the hybridizing 
sequences are very GC-rich, resulting in a relatively 
high T_m. If the high-stringency wash 
does not remove all nonspecific hybridization, 
temperature can be increased by a few degrees. 
The equations above for calculating T_m can be 
used as a guide for selecting the correct tem- 
perature for the final wash, but trial and error 
is more reliable. Note that a membrane that has 
been autoradiographed can be rewashed at a 
higher stringency and put back to expose again, 
the only limitation being decay of the label and 
the need for a longer exposure the second time. 

Table 2.10.3 High-Salt Solutions Used in Hybridization Analysis 

Stock solution           Composition 

20x SSC                  3.0 M NaCl/0.3 M trisodium citrate 
20x SSPE(a)              3.6 M NaCl/0.2 M NaH2PO4/0.02 M EDTA, pH 7.7 
Phosphate solution(b)    1 M NaHPO4, pH 7.2(c) 

(a) SSC may be replaced with the same concentration of SSPE in all protocols. 

(b) Prehybridize and hybridize with 0.5 M NaHPO4 (pH 7.2)/1 mM EDTA/7% SDS [or 50% formamide/0.25 M 
NaHPO4 (pH 7.2)/0.25 M NaCl/1 mM EDTA/7% SDS]; perform moderate-stringency 
wash in 40 mM NaHPO4 (pH 7.2)/1 mM EDTA/5% SDS; perform high-stringency wash in 40 mM 
NaHPO4 (pH 7.2)/1 mM EDTA/1% SDS. 

(c) Dissolve 134 g Na2HPO4·7H2O in 1 liter water, then add 4 ml 85% H3PO4. The resulting solution 
is 1 M Na+, pH 7.2. 

Designing washes for heterologous hybridization. 
Calculations of T_m become more critical 
if heterologous probing is being attempted. 
If the aim is to identify sequences that are 
merely related, not identical, to the probe (e.g., 
members of a multigene family, or a similar 
gene in a second organism), then it is useful to 
have an idea of the degree of mismatching that 
will be tolerated by a "moderate-" or "low-" 
stringency wash. The best way to approach this 
is to first establish the lowest temperature at 
which only homologous hybridization occurs 
with a particular SSC concentration. Then assume 
that 1% mismatching results in a 1°C 
decrease in the T_m (Bonner et al., 1973) and 
reduce the temperature of the final wash accordingly 
(for example, if sequences with 
>90% similarity with the probe are being 
sought, decrease the final wash temperature by 
10°C). If the desired degree of mismatching 
results in a wash temperature of <45°C, then it 
is best to increase the SSC concentration so that 
a higher temperature can be used. Doubling the 
SSC concentration results in a ~17°C increase 
in T_m, so washes at 45°C in 0.1x SSC and 62°C 
in 0.2x SSC are roughly equivalent. Note that 
in these extreme cases it may also be necessary 
to reduce the hybridization temperature to as 
low as 45°C (aqueous solution) or 32°C (formamide 
solution). 

This approach sometimes works ex- 
tremely well (as shown when the heterolo- 
gous targets are eventually sequenced), but 
the assumption that a 1% degree of mismatching 
reduces the T_m of a heteroduplex by 
1°C is very approximate. Base composition 
and mismatch distribution influence the actual 
change in T_m, which can be anything 
between 0.5° and 1.5°C per 1% mismatch 
(Hyman et al., 1973). Unfortunately trial and 
error is the only alternative to the "rational" 
approach described here. 
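
The "rational" approach can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 65°C starting temperature in the example is invented for demonstration, and the 0.5° to 1.5°C bounds are simply the range quoted above.

```python
def heterologous_wash_temp(homologous_wash_c, min_percent_similarity, tm_drop_per_percent=1.0):
    """Final wash temperature (deg C) that tolerates the given degree of mismatch.

    homologous_wash_c: lowest temperature at which only homologous hybridization
    is seen at the chosen SSC concentration (determined empirically).
    """
    percent_mismatch = 100.0 - min_percent_similarity
    return homologous_wash_c - percent_mismatch * tm_drop_per_percent

def wash_temp_range(homologous_wash_c, min_percent_similarity):
    """Bracket the wash temperature using 1.5 and 0.5 deg C per 1% mismatch."""
    low = heterologous_wash_temp(homologous_wash_c, min_percent_similarity, 1.5)
    high = heterologous_wash_temp(homologous_wash_c, min_percent_similarity, 0.5)
    return low, high

def needs_higher_ssc(wash_c, floor_c=45.0):
    """True if the wash would fall below 45 deg C, where raising SSC is advised."""
    return wash_c < floor_c

# Example: homologous-only wash at an assumed 65 deg C, targets with >90% similarity.
print(heterologous_wash_temp(65.0, 90.0))  # rule-of-thumb estimate
print(wash_temp_range(65.0, 90.0))         # bracket to explore by trial and error
```

The bracket makes the uncertainty explicit: the single rule-of-thumb value is only a starting point for the trial and error that the text describes.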



Other parameters relevant to 
hybridization analysis 

Length of prehybridization and hybridiza- 
tion incubations. The protocols recommend 
prehybridization for 3 hr with nitrocellulose 
and 15 min for nylon membranes. Inadequate 
prehybridization can lead to high backgrounds, 
so these times should not be reduced. They can, 
however, be extended without problem. 

Hybridizations are usually carried out over- 
night. This is a rather sloppy aspect of the 
procedure, because time can have an important 
influence on the result, especially if, as de- 
scribed above, an excess amount of a single- 
stranded probe is being used. The difficulties 
in assigning values to the parameters needed to 
calculate optimum hybridization time have led 
to the standard "overnight" incubation, which 
in fact is suitable for most purposes. The ex- 
ception is when hybridization is being taken to 
its limits, for instance in detection of single- 
copy sequences in human DNA, when longer 
hybridization times (up to 24 hr) may improve 
sensitivity if a single-stranded probe is being 
used. Note that this does not apply to double- 
stranded probes, as gradual reannealing results 
in only minimal amounts of a double-stranded 
probe being free to hybridize after ~8 hr of 
incubation. 

Formamide hybridization buffers. Formamide 
destabilizes nucleic acid duplexes, reducing 
the Tm by an average of 0.6°C per 1% 
formamide for a DNA-DNA hybrid (Meinkoth 
and Wahl, 1984) and rather less for a DNA- 
RNA hybrid (Casey and Davidson, 1977; 
Kafatos et al., 1979). It can be used at 50% 
concentration in the hybridization solution, 
reducing the Tm so that the incubation can be 
carried out at a lower temperature than needed 
with an aqueous solution. Originally formam- 
ide was used with nitrocellulose membranes as 
a means of prolonging their lifetime, as the 
lower hybridization temperature results in less 
removal of target DNA from the matrix. More 
recently formamide has found a second use in 
reduction of heterologous background hybrid- 
ization with RNA probes. RNA-DNA hybrids 
are relatively strong, and heterologous du- 
plexes remain stable even at high temperatures. 
The destabilizing effect of formamide is there- 
fore utilized to maximize the discrimination 
between homologous and heterologous hybrid- 
ization with RNA probes. 
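
As a rough worked example of the 0.6°C per 1% figure quoted above, the Tm shift for a DNA-DNA hybrid can be computed directly. This is a minimal sketch: the function names are hypothetical, and subtracting the shift from an aqueous incubation temperature assumes that the same distance below the Tm is wanted in both buffers, which the text does not state explicitly.

```python
def tm_shift_formamide(percent_formamide, deg_per_pct=0.6):
    """Approximate Tm reduction (deg C) for a DNA-DNA hybrid.

    deg_per_pct defaults to the 0.6 deg C per 1% formamide average from the
    text; DNA-RNA hybrids are affected rather less, so pass a smaller value.
    """
    if not 0 <= percent_formamide <= 100:
        raise ValueError("formamide percentage must be between 0 and 100")
    return deg_per_pct * percent_formamide


def formamide_incubation_temperature(aqueous_temp_c, percent_formamide):
    """Incubation temperature with formamide, keeping the same offset from Tm.

    Assumption: the aqueous temperature is simply lowered by the Tm shift.
    """
    return aqueous_temp_c - tm_shift_formamide(percent_formamide)


if __name__ == "__main__":
    # 50% formamide lowers the Tm of a DNA-DNA hybrid by roughly 30 deg C.
    print(round(tm_shift_formamide(50), 1))
```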

Formamide probably confers no major ad- 
vantage on DNA-DNA hybridization with a 
nylon membrane. In fact it introduces two prob- 
lems, the hazardous nature of the chemical 
itself, and an apparent reduction in hybridiza- 
tion rate. The latter point is controversial 
(Hutton, 1977), but for equivalent sensitivity a 
formamide hybridization reaction usually has 
to incubate for longer than an aqueous one. 

Alternatives to SSC. Although SSC has been 
used in hybridization solutions for many years, 
there is nothing sacrosanct about the formula- 
tion, and other salt solutions can be employed 
(Table 2.10.3). There is little to choose between 
these alternatives. SSPE and phosphate solu- 
tions have a greater buffering power and may 
confer an advantage in formamide hybridiza- 
tion solutions. Alternatively, the buffering 
power of SSC can be increased by adding 0.3% 
(w/v) tetrasodium pyrophosphate. 

Probe length. Probe length has a major in- 
fluence on the rate of duplex formation in 
solution hybridization (Wetmur and Davidson, 
1986), but the effect is less marked when the 
target DNA is immobilized. In membrane hy- 
bridization a more important factor is the spec- 
ificity of the probe. The probe should never be 
too long (>1000 bp), as this increases the 
chance of heterologous duplexes remaining 
stable during a high-stringency wash. Neither 
should the probe contain extensive vector se- 
quences, as these can hybridize to their own 
target sites, wrecking the specificity of the 
experiment. 

Mechanics of hybridization. Traditionally 
hybridization has been carried out in plastic 
bags. This technique is messy, radiochemical 
spills being almost unavoidable, and can lead 
to detrimental contact effects if too many mem- 
branes are hybridized in a single bag. Hybrid- 
ization incubators are now available from a 
number of companies and are recommended as 
a distinct advance over the plastic bag technol- 
ogy. Rotation of the hybridization tube results 
in excellent mixing, reducing hot spots caused 
by bubbles and dust and leading to very evenly 



hybridized membranes. High-quality results 
are possible even when ten or more minigel 
Southerns are hybridized in a single 8.5 x 
3.0-inch tube. 

If bags are used, they should be of stiff 
plastic to prevent the sides collapsing on to the 
membrane, which will lead to high back- 
ground. The volume of hybridization solution 
should be sufficient to fill the bag, and no more 
than two membranes should be hybridized in 
each bag. 

Troubleshooting 

Problems in blotting and hybridization re- 
veal themselves when the autoradiograph is 
developed. A guide to the commonest problems 
and how to solve them is given in Table 2.10.4 
(based on Dyson, 1991). 

A particularly troublesome problem is high 
background signal across the entire membrane. 
This is due to the probe attaching to nucleic 
acid binding sites on the membrane surface, the 
same sites that bind DNA during the blotting 
procedure. Prehybridization/hybridization so- 
lutions contain reagents that block these sites 
and hence reduce background hybridization. 
The most popular blocking agent is Denhardt 
solution, which contains three polymeric com- 
pounds (Ficoll, polyvinylpyrrolidone, and 
BSA) that compete with nucleic acids for the 
membrane-binding sites. The formulations 
used in the basic and alternate protocols also 
include denatured salmon sperm DNA (any 
complex DNA that is nonhomologous with the 
target is acceptable) which also competes with 
the probe for the membrane sites. Blocking 
agents are included in the prehybridization so- 
lution to give them a head start over the probe. 
With a nylon membrane, the blocking agents 
may have to be left out of the hybridization 
solution, as they can interfere with the probe- 
target interaction. When the membranes are 
washed, the Denhardt solution and salmon 
sperm DNA are replaced with SDS, which acts 
as a blocking agent at concentrations >1%. 

Other blocking agents can also be used 
(Table 2.10.5). With DNA blots, the main al- 
ternatives to Denhardt are heparin (Singh and 
Jones, 1984) and milk powder (BLOTTO; 
Johnson et al., 1984), although Denhardt is 
generally more effective, at least with nylon 
membranes. Note that BLOTTO contains 
RNases and so can be used only in DNA-DNA 
hybridizations. With an RNA probe, denatured 
salmon sperm DNA is sometimes replaced by 
100 µg/ml yeast tRNA, which has the advantage 
that it does not need to be sheared before 
Table 2.10.4 Troubleshooting Guide for DNA Blotting and Hybridization Analysis^a 

Problem: Poor signal 

Possible cause: Probe specific activity too low 
Solution: Check labeling protocol if specific activity is <10⁸ dpm/µg. 

Possible cause: Inadequate depurination 
Solution: Check depurination if transfer of DNA >5 kb is poor. 

Possible cause: Inadequate transfer buffer 
Solution: 
1. Check that 20x SSC has been used as the transfer solution if small DNA 
fragments are retained inefficiently when transferring to nitrocellulose. 
2. With some brands of nylon membrane, add 2 mM Sarkosyl to the transfer buffer. 
3. Try alkaline blotting to a positively charged nylon membrane. 

Possible cause: Not enough target DNA 
Solution: Refer to text for recommendations regarding amount of target DNA to 
load per blot. 

Possible cause: Poor immobilization of DNA 
Solution: See recommendations in UNIT 2.9A commentary. 

Possible cause: Transfer time too short 
Solution: See recommendations in UNIT 2.9A commentary. 

Possible cause: Inefficient transfer system 
Solution: Consider vacuum blotting as an alternative to capillary transfer. 

Possible cause: Probe concentration too low 
Solution: 
1. Check that the correct amount of DNA has been used in the labeling reaction. 
2. Check recovery of the probe after removal of unincorporated nucleotides. 
3. Use 10% dextran sulfate in the hybridization solution. 
4. Change to a single-stranded probe, as reannealing of a double-stranded probe 
reduces its effective concentration to zero after hybridization for 8 hr. 

Possible cause: Incomplete denaturation of probe 
Solution: Denature as described in the protocols. 

Possible cause: Incomplete denaturation of target DNA 
Solution: When dot or slot blotting, use the double denaturation methods 
described in UNIT 2.9B, or blot onto positively charged nylon. 

Possible cause: Blocking agents interfering with the target-probe interaction 
Solution: If using a nylon membrane, leave the blocking agents out of the 
hybridization solution. 

Possible cause: Final wash was too stringent 
Solution: Use a lower temperature or higher salt concentration. If necessary, 
estimate Tm as described in UNIT 6.4. 

Possible cause: Hybridization temperature too low with an RNA probe 
Solution: Increase hybridization temperature to 65°C in the presence of 
formamide (see alternate protocol). 

Possible cause: Hybridization time too short 
Solution: If using formamide with a DNA probe, increase the hybridization time 
to 24 hr. 

Possible cause: Inappropriate membrane 
Solution: Check the target molecules are not too short to be retained 
efficiently by the membrane type (see Table 2.9.1). 
Table 2.10.4 Troubleshooting Guide for DNA Blotting and Hybridization Analysis^a, continued 

Problem: Poor signal (continued) 

Possible cause: Problems with electroblotting 
Solution: Make sure no bubbles are trapped in the filter-paper stack. Soak the 
filter papers thoroughly in TBE before assembling the blot. Use uncharged 
rather than charged nylon. 

Problem: Spotty background 

Possible cause: Unincorporated nucleotides not removed from labeled probe 
Solution: Follow protocols described in UNIT 3.4. 

Possible cause: Particles in the hybridization buffer 
Solution: Filter the relevant solution(s). 

Possible cause: Agarose dried on the membrane 
Solution: Rinse membrane in 2x SSC after blotting. 

Possible cause: Baking or UV cross-linking when membrane contains high salt 
Solution: Rinse membrane in 2x SSC after blotting. 

Problem: Patchy or generally high background 

Possible cause: Insufficient blocking agents 
Solution: See text for discussion of extra/alternative blocking agents. 

Possible cause: Part of the membrane was allowed to dry out during 
hybridization or washing 
Solution: Avoid by increasing the volume of solutions if necessary. 

Possible cause: Membranes adhered during hybridization or washing 
Solution: Do not hybridize too many membranes at once (ten minigel blots for a 
hybridization tube, two for a bag is maximum). 

Possible cause: Bubbles in a hybridization bag 
Solution: If using a bag, fill completely so there are no bubbles. 

Possible cause: Walls of hybridization bag collapsed on to membrane 
Solution: Use a stiff plastic bag; increase volume of hybridization solution. 

Possible cause: Not enough wash solution 
Solution: Increase volume of wash solution to 2 ml/10 cm² of membrane. 

Possible cause: Hybridization temperature too low with an RNA probe 
Solution: Increase hybridization temperature to 65°C in the presence of 
formamide (see alternate protocol). 

Possible cause: Formamide needs to be deionized 
Solution: Although commercial formamide is usually satisfactory, background 
may be reduced by deionizing immediately before use. 

Possible cause: Labeled probe molecules are too short 
Solution: 
1. Use a 32P-labeled probe as soon as possible after labeling, as radiolysis 
can result in fragmentation. 
2. Reduce amount of DNase I used in nick translation (UNIT 3.5). 

Possible cause: Probe concentration too high 
Solution: Check that the correct amount of DNA has been used in the labeling 
reaction. 

Possible cause: Inadequate prehybridization 
Solution: Prehybridize for at least 3 hr with nitrocellulose or 15 min for 
nylon. 

Possible cause: Probe not denatured 
Solution: Denature as described in the protocols. 

Possible cause: Inappropriate membrane type 
Solution: If using a nonradioactive label, check that the membrane is 
compatible with the detection system. 

Possible cause: Hybridization with dextran sulfate 
Solution: Dextran sulfate sometimes causes background hybridization. Place the 
membrane between Schleicher and Schuell no. 589 WH paper during hybridization, 
and increase volume of hybridization solution (including dextran sulfate) by 
2.5%. 
Table 2.10.4 Troubleshooting Guide for DNA Blotting and Hybridization Analysis^a, continued 

Problem: Patchy or generally high background (continued) 

Possible cause: Not enough SDS in wash solutions 
Solution: Check the solutions are made up correctly. 

Problem: Extra bands 

Possible cause: Final wash was not stringent enough 
Solution: Use a higher temperature or lower salt concentration. If necessary, 
estimate Tm as described in UNIT 6.4. 

Possible cause: Probe contains nonspecific sequences (e.g., vector DNA) 
Solution: Purify shortest fragment that contains the desired sequence. 

Possible cause: Target DNA is not completely restriction digested 
Solution: Check the restriction digestion (UNIT 3.1). 

Possible cause: Formamide not used with an RNA probe 
Solution: RNA-DNA hybrids are relatively strong but are destabilized if 
formamide is used in the hybridization solution. 

Problem: Nonspecific background in one or more tracks 

Possible cause: Probe is contaminated with genomic DNA 
Solution: Check purification of probe DNA. The problem is more severe when 
probes are labeled by random priming. Change to nick translation. 

Possible cause: Insufficient blocking agents 
Solution: See text for discussion of extra/alternative blocking agents. 

Possible cause: Final wash did not approach the desired stringency 
Solution: Use a higher temperature or lower salt concentration. If necessary, 
estimate Tm as described in UNIT 6.4. 

Possible cause: Probe too short 
Solution: Sometimes a problem with probes labeled by random priming. Change to 
nick translation. 

Problem: Cannot remove probe after hybridization 

Possible cause: Membrane dried out after hybridization 
Solution: Make sure the membrane is stored moist between hybridization and 
stripping. 

Problem: Decrease in signal intensity when reprobed 

Possible cause: Poor retention of target DNA during probe stripping 
Solution: 
1. Check calibration of UV source if cross-linking on nylon. 
2. Use a less harsh stripping method (support protocol). 

^a Based on Dyson (1991). 

^b Within each category, possible causes are listed in decreasing order of likelihood. 



use. If a cDNA clone is used as the probe, or 
for the in vitro synthesis of an RNA probe, then 
blockage of sites with high affinity for poly(A)+ 
sequences often reduces background. This is 
achieved by using 10 µg/ml of poly(A) DNA 
as the blocking agent. 

Anticipated Results 

Using either a nitrocellulose or nylon membrane 
and a probe labeled to >10⁸ dpm/µg, there 
should be no difficulty in detecting 10 pg of a 
single copy sequence in human DNA after 24 
hr autoradiography. 

Time Considerations 

The hybridization experiment can be com- 
pleted in 24 hr, the bulk of this being taken up 



by the overnight incubation. Prehybridiza- 
tion takes 3 hr for a nitrocellulose membrane 
or 15 min for nylon. Post-hybridization 
washing to high stringency can usually be 
completed in 1.5 hr. If a single-copy se- 
quence in human DNA is being probed, the 
hybridization step may be extended to 24 hr, 
with a concomitant increase in the length of 
the experiment as a whole. 

The length of time needed for autoradiogra- 
phy depends on the abundance of the target 
sequences in the blotted DNA. Adequate expo- 
sure can take anything from overnight to sev- 
eral days. 
 301 gcaatcaccatgggagatcaatctggctctagaaatggaccacaacagaa 350 
   1 TTTTTTGGATTAACAAGAGAAAGAGTTTATTAATTGTGCATAGT 44 

 351 gtacgtttcaggagcccagaatgcctgggatatgttctctgatgagctgt 400 
  45 GCATGTAACAACACAGGGGAGTCCAGAGATTAGTAACTCAAAAGTTTGGT 94 

 401 cacagaaacacatcactactggttctggtgacctcaatgacatacttggt 450 
  95 TAGATTTGGAGCTGGGCACAGGTTTTCACACTCATAATCCCAGCACT..T 142 

 451 ggcgggattcactgcaaagaagttactgagatcggtggcgtcccaggggt 500 
 143 AGGGAAGCCGACGTAGGAGGATCACTTGAGGTCAGAAGTTTGAGACCAGT 192 

 501 tggtaaaactcaactggggattcaactagcaatcaatgtacaaatcccag 550 
 193 CTGGCCAACATGATGAAACTCTGTCTCTACTAAAAATACAAAAATTAGCC 242 

 551 tggaatgtggtggccttggtgggaaagcagtttatatagatacagagggc 600 
 243 AGGTATGGTGGCACGTGCCTGTATTCGCAGCTCCCAGCTACTCAGGAGGC 292 

 601 agtttcatggttgaacgtgtctaccagattgctgaagggtgtattaggga 650 
 293 TGAGTCA.GGAGAATCGCTTGAACCTGGGAGGTGAAGGTCGCAGTAAGCC 341 

 651 catactggagcactttccgcacagccatgagaagtcctcttctgtccaaa 700 
 342 AAGATTGCGCCACTGCACTCCAGCCCGGGCG 372 

 701 aacaattacagcctgagcgtttcctggcggatatctattacttccggata 750 
 373 GTAGAGCCAGA....TTCCTCTCCCTTGTGTTTTTCTGC 407 

 751 tgcagttacaccgaacaaattgcagtcataaactacatggagaagttcct 800 
 408 TATAAGCTGAAGGTGCTGAATGCAGGCAGTAGCAAGGTCTACCACTCTAT 457 

 801 cagagagcataaagatgtgcgtatagttattattgatagtgttactttcc 850 
 458 CA...ACCATAAAACTTCCCTCTGTATCAATAAAAACTGCTTCACCTGCC 504 

 851 actttcgacaagattttgaagatctggcactgaggaccagagtgctaagt 900 
 505 ACTCCTCCAAACCATTCTGGTATCTGCACATCTACTGCCAACTGCATACA 554 

 901 ggattatcattgaagttaatgaagattgcaaagacatataacttggcagt 950 
 555 TAATTGTGTT TTTCCAACACCTGGTGCACCACAAATTTCTGTTGT 599 

 951 tgtcttgttgaaccaagtcactactaaatttacagaagggtcatttcaat 1000 
 600 TTTCATTAAGGGCACTCCACCCCCAAGAATATCATCTAGTGCTGAACAGA 649 

1001 tgactcttgctctaggtgacagctggtcccactcatgcacgaaccggttg 1050 
 650 AAGTGATTATGAAAGCCTGGGTATGCTCCTGCTCAAGAAGTTC 692 
 351 gtacgtttcaggagcccagaatgcctgggatatgttctctgatgagctgt 400 
   1 GAACTTCTTGAGCAGGAGCATA 22 

 401 cacagaaacacatcactactggttctggtgacctcaatgacatacttggt 450 
  23 CCCAGGCTTTCATAATCACTTTCTGTTCAGCACTAGATGATATTCTTGGG 72 

 451 ggcgggattcactgcaaagaagttactgagatcggtggcgtcccaggggt 500 
  73 GGTGGAGTGCCCTTAATGAAAACAACAGAAATTTGTGGTGCACCAGGTGT 122 

 501 tggtaaaactcaactggggattcaactagcaatcaatgtacaaatcccag 550 
 123 TGGAAAAACACAATTATGTATGCAGTTGGCAGTAGATGTGCAGATACCAG 172 

 551 tggaatgtggtggccttggtgggaaagcagtttatatagatacagagggc 600 
 173 AATGGTTTGGAGGAGTGGCAGGTGAAGCAGTTTTTATTGATACAGAGGGA 222 

 601 agtttcatggttgaacgtgtctaccagattgctgaagggtgtattaggga 650 
 223 AGTTTTATGGTTGATAGAGTGGTAGACCTTGCTACTGCCTGCATTCAGCA 272 

 651 catactg.gagcactttccgcacagccatgagaagtcctcttctgtccaa 699 
 273 CCTTCAGCTTATAGCAGAAAAACACAAGGGAGAGGAATCTGGCTCTACCG 322 

 700 aaacaattacagcctgagcgtttcctggcggatatctattacttccggat 749 
 323 CCCGGGCTGGAGTGCAGTGGCGCAATCTTGGCTTACTGCGACCTTCACCT 372 

 750 atgcagttacaccgaacaaattgcagtcataaactacatggagaagttcc 799 
 373 CCCAGGTTCAAGCG...ATTCTCCTGACTCAGCCTCCTGAGTAGCTGGGA 419 

 800 tcagagagcataaagatgtgcgtatagttattattgatagtgttactttc 849 
 420 GCTGCGAATACAGGCACGTGCCACCATACCTGGCTAATTTTTGTATTTTT 469 

 850 cactttcgacaagattttgaagatctggcactgaggaccagagtgctaag 899 
 470 AGTAGAGACAGAGTTTCATCATGTTGGCCAGACTGGTCTCAAACTTCTGA 519 

 900 tggattatcattgaagttaatgaagattgcaaagacatataacttggcag 949 
 520 CCTCAAGTGATCCTCCTACGTCGGCTTCCCTAAGTGCTGGGATTATGAGT 569 

 950 ttgtcttgttgaaccaagtcactactaaatttacagaagggtcatttcaa 999 
 570 GTGAAAACCTGTGCCCAG...CTCCAAATCTAACCAAACTTTTGAGTTAC 616 

1000 ttgactcttgctctaggtgacagctggtcccactcatgcacgaaccggtt 1049 
 617 TAATCTCTGGACTCCCCTGTGTTGTTACATGCACTATGCACAATTAATAA 666 

1050 gattctgcactggaatgggaacgaacgatacgcacatcttgataagtctc 1099 
 667 ACTCTTTCTCTTGTTAATCCAAAAAA 692 
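
Where the match row between two aligned sequence rows is unreadable, it can be regenerated from the sequences themselves. The following Python sketch is offered only as an illustration: the function names are hypothetical, both rows are assumed to have the same width, "." (or "-") is assumed to mark a gap, and identity is counted over all aligned columns, which may differ from the convention of the program that produced the alignments above.

```python
def match_row(top, bottom, gap_chars=".-"):
    """Return a row with '|' wherever the two aligned rows carry the same base.

    Comparison is case-insensitive; columns containing a gap never match.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal width")
    marks = []
    for a, b in zip(top.upper(), bottom.upper()):
        is_gap = a in gap_chars or b in gap_chars
        marks.append("|" if (a == b and not is_gap) else " ")
    return "".join(marks)


def percent_identity(top, bottom, gap_chars=".-"):
    """Identical columns as a percentage of all aligned columns."""
    row = match_row(top, bottom, gap_chars)
    if not row:
        return 0.0
    return 100.0 * row.count("|") / len(row)


if __name__ == "__main__":
    # Short made-up rows, not taken from the alignments in this document.
    upper = "acg.t"
    lower = "ACTTT"
    print(upper)
    print(match_row(upper, lower))
    print(lower)
    print(percent_identity(upper, lower))
```

Rows of unequal width, such as the first and last blocks of each alignment, would need their offset restored before they can be compared this way.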



